2.1. General Picture
One of nature’s greatest mysteries is how simple bacteria could give rise to creatures as complex as humans. The fundamental law of evolution (natural selection of undirected genetic changes) follows from the ‘vertical’ copying of genetic information (with small random changes). Vertical copying increases the number of organisms, which results in competition for resources and natural selection. However, the growth of complexity does not follow from the vertical copying of information. It depends on a second fundamental type of copying, which is orthogonal to the first.
The amount of information can increase through duplication not only of the whole genome but also of single genes; however, this review considers only whole-genome duplication (WGD). Susumu Ohno was the first to draw attention to the role of WGD in evolution
[13]. In honor of Ohno, the genes arising from the organismal polyploidization are called ‘ohnologs’
[14]. His main idea was that, as a result of duplication, the additional gene copy is freed from purifying selection and thereby acquires evolutionary plasticity. He paid special attention to WGD because it completely preserves gene regulatory regions and the initial gene dosage balance, thereby providing a broad foundation for systemic evolutionary experiments. In addition, polyploidization plays a special role in the increase in complexity owing to certain properties of cellular network growth under WGD as compared with single-gene duplication
[12], which will be discussed below.
It is now firmly established that there were two rounds of ancient WGD in the vascular plant lineage (390 and 300 Mya) and in the vertebrate lineage between chordates and vertebrates (both about 500 Mya, near the Cambrian explosion)
[14]. In teleost fishes, there was a third WGD (300 Mya), while in salmonid fishes, a fourth (95 Mya)
[15,16]. Furthermore, more recently formed polyploid strains, species, and above-species lineages are widespread in nature, agriculture, and aquaculture. They are frequent in plants, prokaryotes, protists, fungi, invertebrates, and ectotherm vertebrates
[5,17,18]. Among vertebrates, the record belongs to the genus Xenopus, in which 26 of 27 species are polyploid, and to sturgeons, in which dodecaploid species with 380 chromosomes have been found
[19,20]. In reptiles, there are only triploid species (and they are very rare), which propagate by parthenogenesis
[21]. In birds and mammals, polyploids can also arise after conception, yet die in early development
[22]. One rodent species is sometimes mentioned as polyploid; however, this attribution is probably incorrect
[23]. Notably, certain model organisms are lineage-specific polyploids (yeast, zebrafish, Xenopus, Arabidopsis).
2.2. Cellular and Molecular Aspects
One of the main problems that new polyploids encounter is the disturbance of chromosome pairing in meiosis. This problem is alleviated in allopolyploids, where polyploidization results from the hybridization of two related species. In allopolyploids, the chromosomes of the parental species differ and therefore pair separately. Probably as a result of this feature, allopolyploids become established more frequently than autopolyploids, although they arise more rarely
[18,24]. Polyploidization is followed by rediploidization, i.e., the process of converting a polyploid chromosome set into a diploid one. Rediploidization does not mean a return to the original diploid state but only the re-establishment of regular chromosome pairing through chromosome rearrangement and DNA divergence
[16,20]. This process is accompanied by gene loss (due to deletion or pseudogenization), subfunctionalization, and neofunctionalization
[25,26]. Subfunctionalization is a partitioning of an initial gene function (e.g., making it tissue-/development stage-/condition-specific), whereas neofunctionalization is the acquisition of a novel function. Subfunctionalization and neofunctionalization lead to gene diversification, which should be accompanied by an integration of the work of a growing number of diversifying genes, i.e., an increase in regulatory complexity
[27].
In genome-wide studies, subfunctionalization can be detected by the changes in gene expression and the properties of the interactome. For example, in the yeast interactome, the proteins encoded by ohnologs have a lower number of protein interactions compared to the single-gene duplicates and genes without duplicates (i.e., genes that lost the second ohnolog)
[28]. The loss of interactions was proportional to their initial number and independent of the ohnolog position in the protein interaction network. The functional analysis of the overlapping and non-overlapping interactants in each pair of ohnologs revealed a sharp asymmetry in the regulatory functions. The regulatory functions occur more frequently in the non-overlapping interactants. These facts suggest that subfunctionalization was a prevailing trend in the evolution of retained ohnologs, and was accompanied by regulatory specialization.
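The comparison of overlapping and non-overlapping interactants can be illustrated with a minimal sketch. The partner names below are hypothetical placeholders, not data from [28], and the Jaccard index is used here only as one convenient (assumed) summary of how much of the interaction profile a pair of ohnologs still shares.

```python
def partition_interactants(partners_a, partners_b):
    """Split the interaction partners of two ohnologs into shared and copy-specific sets."""
    a, b = set(partners_a), set(partners_b)
    return {
        "overlapping": a & b,       # partners retained by both copies
        "non_overlapping": a ^ b,   # partners kept by only one copy
    }


def jaccard_overlap(partners_a, partners_b):
    """Fraction of all partners of the pair that are shared by both copies."""
    a, b = set(partners_a), set(partners_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical interaction partners of one ohnolog pair (illustrative only).
ohnolog_1 = {"P1", "P2", "P3", "K1"}
ohnolog_2 = {"P2", "P3", "T1", "T2"}

parts = partition_interactants(ohnolog_1, ohnolog_2)
overlap = jaccard_overlap(ohnolog_1, ohnolog_2)
print(sorted(parts["overlapping"]), sorted(parts["non_overlapping"]), round(overlap, 3))
```

A low overlap with many copy-specific partners is the pattern expected under subfunctionalization; a functional annotation of the two sets would then be compared separately.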
The increase in regulatory complexity and elaboration of signal transduction systems is a hallmark of post-WGD evolution
[29]. The predominant retention of regulatory and developmental genes after WGD may occur because relaxed selection allows duplicated networks to be rewired and to evolve novel functionality, thereby increasing biological complexity
[5]. In the human interactome, the more ancient genes have a higher local and global centrality compared with newer genes, which indicates a gradual core-to-periphery evolutionary growth of the interactome
[12]. The local centrality is the number of direct interactions of a protein (degree), while the global centrality is estimated by the number of protein-interaction paths going through a protein in the whole network (betweenness) or by the average path length to other proteins (closeness). However, there is a remarkable exception: the novel genes did not decline in the network centrality during the expansion of the multicellular organization from Bilateria to Euteleostomi (bony vertebrates)
[12]. This plateau of interactome centrality is related to the series of whole-genome duplications. It indicates that, in contrast to single-gene duplicates, which provide gradual core-to-periphery network growth, WGD also duplicates the network core (the part richest in interactions). Thus, interactome complexity becomes higher when the network grows by WGD than when it grows by single-gene duplication. The great diversity of vertebrates, which appeared after the two WGD rounds followed by sub- and neofunctionalization of ohnologs, can be considered an extension of the Cambrian explosion in the vertebrate lineage (reflected in the plateau of interactome centrality).
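The three centrality measures named above (degree, betweenness, closeness) can be made concrete with a small sketch on a toy undirected network. The graph and node names are hypothetical, and the implementation is a plain standard-library version (breadth-first search plus Brandes' algorithm for betweenness), not the procedure used in [12]. Closeness is computed here as the inverse of the average shortest-path length to reachable nodes, and betweenness is left unnormalized.

```python
from collections import deque


def degree(graph):
    """Local centrality: number of direct interaction partners."""
    return {v: len(nbrs) for v, nbrs in graph.items()}


def closeness(graph):
    """Inverse of the average shortest-path length to all reachable nodes."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def betweenness(graph):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}


# Hypothetical toy network: hub A, with a short chain A-D-E.
toy = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A", "E"],
    "E": ["D"],
}
print(degree(toy))
print(betweenness(toy))
print(closeness(toy))
```

In this toy network the hub A has the highest value on all three measures, which is the sense in which a network 'core' is richer in interactions than the periphery.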
The human ohnologs are most strongly enriched in the bivalent genes and the genes related to development, neuronal membrane, and chromatin (Figure 1 and Figure 2). The bivalent genes, which have both activating and repressive epigenetic marks in their promoters and can switch on/off quickly, are key regulators of cellular networks
[30,31]. These facts emphasize the important role of ohnologs in regulation.
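For readers unfamiliar with enrichment statistics, the following minimal sketch shows one common way such p-values are obtained: a one-sided hypergeometric test. This is an assumption made for illustration; the actual procedure follows [32] and may differ. The counts are made up, and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) when drawing n genes from N, of which K carry the annotation.

    N: genes in the background, K: annotated genes in the background,
    n: genes in the tested set (e.g., ohnologs), k: annotated genes in that set.
    """
    numerator = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return numerator / comb(N, n)


def fold_enrichment(N, K, n, k):
    """Observed annotated fraction in the set divided by the background fraction."""
    return (k / n) / (K / N)


# Made-up toy counts (not data from the review).
p_value = hypergeom_enrichment_p(N=20, K=8, n=5, k=4)
fold = fold_enrichment(N=20, K=8, n=5, k=4)
print(p_value, fold)
```

With genome-scale counts the same function applies unchanged, because Python integers are exact; only the final division is carried out in floating point.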
Figure 1. The enrichment of human ohnologs in bivalent genes and genes involved in multicellular organism development (GO:0007275). Bivalent: p < 10^−151; development: p < 10^−67. The ohnologs (strict) were from [14]. The bivalent genes (Fantom-confirmed) were from [30]. The enrichment analysis was conducted as in [32].
Figure 2. The enrichment of pre-WGD and post-WGD human ohnologs in functional gene groups. (A,B) Nuclear chromatin (GO:0000790) and synapse (GO:0045202). Chromatin: underrepresentation (p < 0.01) in pre-WGD, enrichment (p < 10^−61) in post-WGD. Synapse: enrichment (p < 10^−71) in pre-WGD, enrichment (p < 10^−8) in post-WGD. (C,D) Transcription factors (TF) and ion transmembrane transporter activity (GO:0015075; ITT). TF: underrepresentation (p < 10^−7) in pre-WGD, enrichment (p < 10^−74) in post-WGD. ITT: enrichment (p < 10^−56) in pre-WGD, not significant (p > 0.4) in post-WGD. Although both pre-WGD and post-WGD ohnologs are enriched in the synapse genes, there are significant differences in binomial proportions between pre-WGD and post-WGD ohnologs (p < 10^−15). The pre-WGD and post-WGD ohnologs were from [33]. The transcription factors were from [34].
The ohnologs present a unique model for studying evolutionary speed in different functional gene groups (as WGD provides a common starting point). Although all ohnologs are of pre-WGD origin by definition, it is possible to distinguish those that underwent accelerated sub/neofunctionalization after the WGD by using shallow phylostratigraphy (gene dating)
[33]. These genes rapidly accumulated changes and can be mapped only to the post-WGD phylostrata by shallow phylostratigraphy. In other words, as a result of the rapid evolution, they lost strict orthology with the genes from phylogenetic lines branching before the WGD and retained it only with the genes from the post-WGD branches (they were called ‘post-WGD’ ohnologs).
Importantly, the synapse and the chromatin are the most enriched Cellular Component Gene Ontology categories in the pre-WGD and post-WGD ohnologs, respectively. Thus, ohnologs are involved in both regulatory levels of the organism: the nucleome and the connectome. The pre-WGD ohnologs show enrichment in the synapse genes and underrepresentation in the chromatin genes, whereas the post-WGD ohnologs show a stronger enrichment in the chromatin genes than in the synapse genes, with a significant difference between these enrichments (Figure 2A,B). The most enriched GO Molecular Function in the pre-WGD ohnologs is ion transmembrane transporter activity, whereas in the post-WGD ohnologs it is transcription factor activity (Figure 2C,D). Thus, the nuclear regulome evolves faster than the molecular basis of neuronal signal transmission. Possibly, this is because the main regulatory task in the nervous system has shifted from the molecular to the cellular level, and this shift can restrict the evolution of the genes involved.
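The 'difference in binomial proportions' between pre-WGD and post-WGD ohnologs can be illustrated with a two-proportion z-test. This is a sketch under the assumption of a pooled normal approximation; the source does not state which test was used, and the counts below are invented purely for illustration.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two binomial proportions.

    x1 of n1 genes in group 1 carry the annotation, x2 of n2 in group 2.
    Returns (z, p_value) using the pooled-variance normal approximation.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Invented counts: annotated genes among hypothetical pre-WGD and post-WGD sets.
z, p_value = two_proportion_z_test(x1=120, n1=1000, x2=60, n2=1000)
print(round(z, 3), p_value)
```

A positive z here means the first group has the higher annotated fraction; the normal approximation is adequate only when the expected counts in both groups are reasonably large.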
In addition to the increase in complexity, polyploids show enhanced evolutionary plasticity
[35]. Large-scale species radiation in a given taxonomic group typically follows polyploidization events, indicating the potential of polyploidization for speciation
[36]. However, whatever the long-term evolutionary prospects of polyploids may be, natural selection is not teleological and works only for immediate adaptation. Polyploids show a higher resistance to environmental stress and diseases
[37,38]. They are especially advantageous under dramatically challenging environments
[39]. Thus, many ancient polyploidization events were concentrated near crucial historical periods, such as the Cretaceous–Paleogene boundary, when an asteroid hit the Earth resulting in the drastic alteration of climate and mass extinction
[36].
2.3. Polyploidy in Agriculture and Aquaculture Biotechnology
Polyploidy was a key factor in the domestication of crops
[26,40,41]. Domesticated plants have gone through more polyploidization events than their wild relatives
[40]. Genetic plasticity of the polyploid genome and multicopy genes present a special advantage for domestication
[36]. Lineage-specific polyploids include such major crops as wheat, oat, maize, cotton, potato, legumes, banana, sugarcane, oilseed rape, strawberry, coffee, mustard, and tobacco. The domesticated organisms are mostly allopolyploids (i.e., the results of polyploidization associated with hybridization). Hybridization is frequently accompanied by enhanced heterozygosity and hybrid vigor, while polyploidization restores the fertility of newly formed hybrids and contributes to the stabilization of the hybrid genome, fixing both heterozygosity and new hybrid characters
[41].
Allopolyploidization provides a route to the formation of new organisms. Synthetic polyploids have been employed to enhance beneficial traits, such as higher fitness, disease resistance, faster growth, and greater production compared with their natural diploid counterparts
[42,43]. For example, synthetic hexaploid wheat presents a novel source of genetic diversity for resistance to multiple biotic stresses
[38]. Many artificial polyploids are utilized commercially in aquaculture and most of them were created from natural polyploid fishes and shellfishes, especially cyprinids and salmonids
[44]. There is competition between parental genomes in polyploid hybrids, called ‘subgenome dominance’, in which the less expressed subgenome tends to lose more genes and cis-acting elements than the counterpart subgenome
[26,41]. In extreme cases, the parental genomes can even segregate separately in the germ cell lineage, and the hybrid state arises de novo in each generation
[45,46]. The recursive production of de novo polyploid hybrids is used when allopolyploids are sterile or lose their hybrid vigor through inbreeding depression. For instance, an allohexaploid carp is massively propagated by crossing parental species in each generation
[44].