2. Seeking Sense in the Hox Gene Cluster
2.1. Seeking Sense in the Evolutionary Origin of Hox Clustering and Transcriptional Direction
Clustering arose through gene duplication from an original proto-Hox gene. Duplication occurs through unequal cross-over of DNA during meiosis in germ cells. This may be due to a cross-over between repeat elements in the DNA, or between other regions of similar sequence. It results in one chromosome losing a genetic fragment (such as a Hox gene in our case) while its paired homologue incorporates the extra fragment.
This process, called “tandem gene duplication”, leaves one chromosome with both the maternal- and paternal-derived copies of the Hox gene lying in tandem on the same DNA strand, and in the same orientation
[14][15][20,21]. If this expanded chromosome contributes to a zygote, then the additional Hox copy offers new scope for evolutionary change. For example, if it acquires a new anterior boundary of expression and function, due to a change in regulation or mutation, then this can specify a new zone of development along the head–tail axis. It is usually suggested that this re-purposing of the new Hox gene occurs after the duplication event
[15][21]. However, it has been proposed that two identical Hox genes may provide a selective disadvantage and a more reasonable hypothesis might be that the genes were initially alleles of the same gene that already possessed some useful differences in expression and/or function
[16][22].
Repeated cycles of tandem gene duplication then led to the growing Hox gene cluster, permitting ever-increasing complexity of body structures along the head–tail axis. The process neatly explains not only why the Hox genes were formed in clusters, but also why they transcribe in the same direction. The cluster developed with spatial collinearity, and its polarity (that is, whether anterior-expressed genes lie upstream or downstream of posterior genes) was likely established with the initial duplication/re-purposing event
[17][23].
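The duplication step described above can be illustrated with a toy model. The following is only a minimal sketch: the `Gene` record, the gene names and the breakpoint indices are hypothetical, and a chromosome is reduced to an ordered list of genes. It shows how a misaligned cross-over leaves one product with two copies of the gene in tandem and in the same orientation, while the other product loses the gene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    strand: str   # "+" or "-": transcriptional orientation
    origin: str   # "maternal" or "paternal" (illustrative label only)


def unequal_crossover(maternal, paternal, break_m, break_p):
    """Recombine two homologues at misaligned breakpoints.

    break_m and break_p are list indices (hypothetical breakpoints).
    Returns the two recombinant products.
    """
    product_a = maternal[:break_m] + paternal[break_p:]
    product_b = paternal[:break_p] + maternal[break_m:]
    return product_a, product_b


def chromosome(origin):
    # A toy chromosome: the proto-Hox gene between two flanking markers.
    return [
        Gene("flankL", "+", origin),
        Gene("protoHox", "+", origin),
        Gene("flankR", "+", origin),
    ]


maternal = chromosome("maternal")
paternal = chromosome("paternal")

# Break after protoHox on the maternal copy, before protoHox on the paternal copy.
expanded, depleted = unequal_crossover(maternal, paternal, 2, 1)

for gene in expanded:
    print(gene.name, gene.strand, gene.origin)
```

In this sketch the expanded product carries the maternal and paternal proto-Hox copies side by side on the same strand, and repeating the step on an expanded chromosome would lengthen the cluster further without changing orientation.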
Hox genes are members of the ANTP class of homeobox-containing genes. This also includes developmental gene groups paraHox, Dlx and NK, and all are thought to have arisen by tandem duplication from the same ANTP class proto-Hox gene
[18][24]. Although paraHox, Dlx and NK genes are now usually dispersed from the Hox cluster, some of these remain linked to the Hox cluster in at least some protostome and deuterostome species
[18][19][20][21][24,25,26,27]. Apart from the ANTP class genes, there are ten other classes of homeobox genes
[18][24], such as PRD, POU and LIM, and all classes likely arose by tandem gene duplication from an original proto-homeobox gene.
The question then arises as to when these events took place.
Figure 4 indicates the origin of homeobox genes within the tree of life. There are no ANTP class genes outside the metazoa
[18][24]. However, duplication and diversification of ANTP class genes must have occurred very early in pre-bilaterian evolution because at least some poriferans (sponges) possess both NK
[18][24] and paraHox genes
[22][28] even though they lack Hox genes, which are presumed to have been secondarily lost
[22][28]. Cnidarians possess Hox, paraHox and NK genes
[18], but there is debate over whether cnidarian Hox genes are strict orthologues of bilaterian Hox genes
[23], and whether they show spatial collinearity
[5] (Section 2.9). Proceeding backwards in time, the greatest proliferation of homeobox genes occurred with the advent of multicellularity in animals (metazoa), and also in plants and fungi (
Figure 4)
[24][29]. Bacteria do not have homeobox genes. However, they do have helix-loop-helix genes which have similarities in structure and function, and which may have been the ancestral progenitors of homeobox genes
[24][29].
Figure 4. Origins of homeobox genes within the tree of life. Branching orders of Porifera relative to Cnidaria, and of Xenacoelomorpha relative to protostomes and deuterostomes, are uncertain. LUCA, last universal common ancestor; Ch-MLCA, choanoflagellate-metazoan last common ancestor; Cn-BLCA, cnidarian–bilaterian last common ancestor; P-DLCA, protostome–deuterostome last common ancestor; MYA, million years ago. Figure from Gaunt 2019
[6].
Overall, the distribution of homeobox and Hox genes throughout the tree of life makes sense in terms of the progressive development over time of the Hox gene cluster.
2.2. Seeking Sense in the Maintenance and Compactness of Clusters
Clusters are not always maintained. Some species may develop splits in the cluster, e.g.,
Drosophila, where splits occur at different positions in separate sub-species
[25][30]. However, the large regulatory regions between insect Hox genes probably limit the number of places where clusters can be broken without affecting gene function
[26][31]. Other species, such as the urochordate
Oikopleura, show complete dispersal of the Hox cluster throughout the genome although, remarkably, the genes continue to show spatial collinearity in expression relative to the positions that they occupied in the ancestral cluster
[27][19].
Most species, however, continue to maintain the clustering of their Hox genes, at least to some extent. One explanation for this is enhancer sharing between Hox genes. For example, the
iab-5 regulatory region in
Drosophila apparently regulates both
abd-A and
Abd-B [28][32]. Similarly, in mice, the CR3 enhancer regulates both
Hoxb4 and
Hoxb3 to become expressed up to the level of rhombomere 6/7
[29][33], and a separate Kreisler element then also drives
Hoxb3 expression in rhombomere 5
[30][34]. These are examples of local enhancer sharing, where the enhancer lies within the Hox gene cluster.
In vertebrates, there are also long-range enhancers, usually located outside the Hox gene cluster, which can loop-in to regulate multiple Hox genes in a tissue-specific way
[31][32][35,36]. Long-range enhancers, seen in vertebrates, encourage compaction and exclusion of other, non-Hox, genes. Compact clusters of vertebrates (
Figure 5) are likely a derived rather than ancestral condition
[32][36].
Figure 5. Hox clusters are more compact in vertebrates than in non-vertebrates. Clusters shown are all apparently intact. Sizes shown are from:
Ixodes [33][37];
Parhyale [34][38];
Apis [35][39];
Anopheles [33][37];
Nasonia [33][37];
Tribolium [36][40];
Saccoglossus [37][41]; sea urchin
[38][42]; amphioxus
[32][36]; mouse
[33][37].
Other proposed reasons to maintain clustering are provided by the chromatin opening model, the Hox conjecture, both discussed below, and by the supposition that transcriptional activation and repression may each work best on nearby genes, as discussed in
Section 2.3. Species that have lost their clustering have presumably overcome the above requirements.
Transcriptional direction within Hox clusters is usually maintained, but the
Dfd gene in
Drosophila provides an example of gene inversion from the ancestral state (
Figure 1). Inversion is normally selected against because the promoter of the inverted gene may fall under the regulation of neighbour gene enhancers, with serious consequences for development. For example, the well-known dominant
Drosophila Antennapedia homeotic mutation in which legs develop instead of antennae is usually due to inversions in the
Antp gene, causing it to become ectopically expressed by the enhancers of upstream, more anteriorly-expressed genes
[39][40][43,44]. Darbellay et al.
[41] [45] inverted mouse
Hoxd11 and
d12, including displacement of a CTCF insulator element that normally separates them from
Hoxd13. This caused an anterior shift in expression from the
Hoxd13 locus, possibly due to its misregulation from a
Hoxd11/d12, or long-range, enhancer. These authors also describe a
Hoxd11 inversion that robustly suppressed expression from its
Hoxd12 neighbour, and they propose that this may be due to the collision of transcription units on opposing DNA strands.
2.3. Seeking Sense in Spatial and Temporal Collinearities
Spatial collinearity describes the correspondence between the order of genes along the cluster (3′ to 5′) and the order of their expression domains
(anterior-to-posterior) along the developing embryo. Temporal collinearity describes a correspondence between the order of genes along the cluster (3′ to 5′) and the time of their first expression (early to late) in the embryo.
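The two definitions above amount to a simple ordering test, sketched here under stated assumptions. The gene names, boundary positions and onset times are invented for illustration and are not taken from any species. Collinearity is treated as a non-decreasing sequence when genes are read 3′ to 5′ along the cluster.

```python
def is_collinear(genes_3p_to_5p, key):
    """Return True if `key` never decreases when genes are read 3' to 5'.

    genes_3p_to_5p: list of dicts, ordered along the cluster from 3' to 5'.
    key: "anterior_boundary" (spatial test) or "onset" (temporal test).
    """
    values = [gene[key] for gene in genes_3p_to_5p]
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical cluster: larger anterior_boundary = more posterior,
# larger onset = later first expression (arbitrary units).
cluster = [
    {"name": "geneA", "anterior_boundary": 1, "onset": 1.0},
    {"name": "geneB", "anterior_boundary": 2, "onset": 1.5},
    {"name": "geneC", "anterior_boundary": 3, "onset": 2.0},
    {"name": "geneD", "anterior_boundary": 4, "onset": 2.5},
]

# The same genes with two neighbours swapped in cluster order.
rearranged = [cluster[0], cluster[2], cluster[1], cluster[3]]

print(is_collinear(cluster, "anterior_boundary"))     # spatial collinearity
print(is_collinear(cluster, "onset"))                 # temporal collinearity
print(is_collinear(rearranged, "anterior_boundary"))  # broken by the swap
```

Treating ties as collinear is a design choice of this sketch; a stricter test would require each value to increase.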
In addition to spatial collinearity in Hox expression, found to some extent in most bilaterians [5][6], many species including vertebrates [42], cephalochordates [43], some annelids [44] and some arthropods [45] display temporal collinearity.
Many embryo types, including vertebrates, grow by successive addition of new parts at their posterior ends. That is, they develop in a head-to-tail temporal sequence from a posterior growth zone, and each new zone moves its overall Hox expression one step down in the pattern shown in
Figure 2A, right. The “chromatin opening” model was developed principally for embryos that develop from a posterior growth zone. It proposes that genes are expressed by progressive, timed, 3′ to 5′ opening of Hox cluster chromatin structure that permits and regulates the expression of Hox genes in their anterior-to-posterior temporal sequence (temporal collinearity)
[46]. This means that temporal collinearity specifies the need for, and is dependent upon, spatial collinearity. If this was so ancestrally, then
it can be predicted that the ancestral bilaterian already had temporal collinearity
[47] and developed its body axis from a posterior growth zone
[48].
While it is clear that there is indeed a progressive opening of the Hox cluster as gastrulation proceeds
[9], it is not obvious whether this is a cause or a consequence of Hox gene activation. Some experimental evidence against the chromatin opening model has been reviewed earlier
[8]. More recent work in mouse embryonic stem cell aggregates
[49] indicates that CTCF-binding insulator elements are successively breached along the Hoxd cluster, with accompanying, progressive chromatin extrusion and change in Hox gene expression. This process in itself suggests a need for some level of spatial collinearity and would be in keeping with the chromatin opening model. However, the authors note that spatial and temporal collinearities persist even after deletion of these insulator elements, and they conclude that insulators regulate the pace and precision of collinearities rather than their organization
[49][50]. Duboule’s Hox Conjecture
[51] proposes that, for reasons yet unclear but not necessarily dependent upon chromatin opening, spatial collinearity may still be necessary to achieve temporal collinearity in Hox expression. Evidence for this is largely circumstantial: species displaying temporal collinearity have so far been found to have an intact, unbroken Hox gene cluster. Studies on the annelid
Urechis unicinctus perhaps provide some evidence against this
[52]. Its posterior Hox genes (
Lox2 and
Post2) are separated from more anterior genes (
Hox1 to
Lox4) by more than 1 Mb, forming sub-clusters, yet the genes as a whole display spatial and temporal collinearities
[52]. Conversely, however, an intact cluster need not necessarily imply temporal collinearity. For example, the hemichordate
Saccoglossus [37][41] has an intact cluster, but apparently does not show temporal collinearity
[53].
An alternative “gene segregation” model
[8] was proposed to account for why spatial collinearity is largely maintained even in species showing neither temporal collinearity nor a posterior growth zone, including perhaps the ancestral bilaterian. It could also be called the “single boundary” model. In this model, spatial collinearity is the primary feature and temporal collinearity follows as a necessary consequence in animals that develop from a posterior growth zone. This is so in these animals because anterior parts form first due to the earliest expression of anterior Hox genes, which are 3′-located. Posterior parts form last due to the late expression of posterior Hox genes, which are 5′-located. Overall, spatial collinearity thereby gives rise to temporal collinearity. The model does not dispute the importance of timing in Hox gene activation, especially so in species that develop from a posterior growth zone, but it does suggest that this may not depend primarily upon spatial collinearity, as it does in the chromatin opening model.
Comparison of
Figure 2A with
Figure 2B shows how spatial collinearity (
Figure 2A) provides the minimum number (one) of boundaries between expressible and non-expressible Hox genes within the cluster at all positions along the body (
Figure 3). The single boundary model proposes that the bilaterian cluster was established and maintained with spatial collinearity because this arrangement allows a number of potential benefits: see
[8] and references therein, and now briefly summarized as follows. (i) Minimal boundaries mean the minimum risk that gene repressive factors, e.g., Pc protein, will inadvertently spread from inactive to active genes; and also, a minimum risk that gene activating factors, e.g., Trx protein, will spread from active to inactive genes. (ii) All active genes can more readily be grouped together in a nuclear transcription factory where there may be higher local concentrations of transcription and other factors necessary for gene expression. Additionally, all non-expressible genes can be grouped together in a nuclear Pc body where there may be higher local concentrations of repressive factors. (iii) Segregated active and inactive genes are less likely to interfere sterically with each other in their movement to transcription factories and Pc bodies. (iv) Contiguity of repressed Hox genes allows potential cross-spreading between them of repressive components.
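The boundary-counting argument behind the single boundary model can be checked with a small calculation. This is a sketch only, under a simplifying assumption that is not stated in this form in the text: each gene is taken to be expressible at every axial position at or posterior to its anterior boundary (a nested pattern). The boundary values and the scrambled gene order are hypothetical.

```python
def count_boundaries(states):
    """Count adjacent gene pairs whose expressible/non-expressible state differs."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


def max_boundaries(anterior_boundaries, n_positions):
    """Largest number of active/inactive boundaries found at any axial position.

    anterior_boundaries: anterior boundary of each gene, listed in cluster
    order (3' to 5'); position 1 is most anterior.
    """
    worst = 0
    for position in range(1, n_positions + 1):
        # Assumed nested pattern: expressible at or behind the anterior boundary.
        states = [boundary <= position for boundary in anterior_boundaries]
        worst = max(worst, count_boundaries(states))
    return worst


collinear = [1, 2, 3, 4, 5]   # cluster order matches axial order
scrambled = [3, 1, 5, 2, 4]   # same genes, cluster order shuffled

print(max_boundaries(collinear, 5))
print(max_boundaries(scrambled, 5))
```

Under these assumptions the collinear arrangement never shows more than one boundary within the cluster, whereas the shuffled arrangement interleaves expressible and non-expressible genes and shows several.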
Hox gene collinearity, well known to be ubiquitous in bilaterians, therefore remains an enigmatic feature of Hox gene expression. Two models which attempt to make sense of it are reviewed here. The ancestral bilaterian was likely to have had a non-compact Hox cluster, with vertebrate clusters having since become compacted
[32][36].
Vertebrate Hox genes, perhaps becoming mutually more entangled in their regulation, might therefore not provide the ideal system to make sense of the ancestral state. There remains a need to investigate collinearity in a wide range of species.