Seeking Sense in the Hox Gene Cluster

Please note this is a comparison between Version 1 by Stephen Gaunt and Version 4 by Stephen Gaunt.

The Hox gene cluster, responsible for patterning of the head–tail axis, is an ancestral feature of all bilaterally symmetrical animals (the Bilateria) that remains intact in a wide range of species.

Hox cluster
collinearity
evolution
axial morphology
Bilateria

1. Introduction

An unexpected discovery in the 1980s was that both arthropods (Drosophila) and vertebrates (mice) utilise conserved clusters of Hox genes in order to specify pattern formation along their head–tail axes. The widely held conclusion is that the common ancestor to these two groups (the protostome–deuterostome last common ancestor, or P-DLCA) already possessed a cluster of seven, or more, Hox genes which it used, most likely as in its descendants today, for specification of distinct body parts along the anterior-to-posterior (A-P) axis (Figure 1) ^[1][2][3].

Figure 1. Homologous Hox gene clusters in Drosophila, mouse/human and, by inference, their common ancestor. The ancestor (the protostome–deuterostome last common ancestor, P-DLCA) may have had more than the seven genes shown here [4]. Genes that share the same numbers and shading intensities are most recently related by descent. Hox-derived genes in Drosophila that no longer function as true Hox genes are named in grey text. Arrows indicate directions of transcription (presumed for ancestor). ANT-C, Antennapedia complex; BX-C, Bithorax complex.

The Hox cluster is an ancestral feature of all bilaterally symmetrical animals (Bilateria) retained in many, though not all, species. It can be said that it evolved successfully only once, since the cluster is the same in most groups, with labial-like genes at one end of the cluster expressed in the anterior embryo, and Abd-B-like genes at the other end of the cluster expressed posteriorly ^[4][5][6]. Although now disrupted in some descendants, the Hox gene cluster in many others, both protostome and deuterostome, is still conserved to a large extent in its inferred ancestral form ^[4][5][6]. The entire cluster has undergone duplications in vertebrates (Figure 1) though it remains unduplicated in invertebrate deuterostomes ^[4][5][6].

From his pioneering analyses of Drosophila developmental mutants, Ed Lewis [7] was the first to propose the following (Figure 2A): that the clustered set of genes now known as Hox genes is expressed in a series of partially overlapping domains along the length of the embryo; that the order of the genes along the chromosome corresponds with the order of their expression domains along the head–tail axis (the spatial collinearity rule); and that the blend of Hox genes active in each segment or A-P domain of the body is responsible for the specification of its structure. Subsequent work showed that these predictions hold true, at least in part, for many or most bilaterally symmetrical animals ^[5][6].

Figure 2. Hox expression patterns which do, or do not, conform to Lewis’s collinearity model. (A) Conforming to Lewis’s model for Drosophila ^[7], and as now known to apply in many different animals ^[5][6], Hox genes are expressed in a series of partially overlapping domains along the body with the order of genes along the cluster being collinear with the order of their expression domains along the head–tail axis. This is shown in Figure 2A-left, and the genes active at different positions along the body are shown in Figure 2A-right. (B) Scenario where Hox3 and Hox5 gene expressions do not conform to collinearity. Arrows show directions of transcription. Figure from Gaunt 2015 [8].

Lewis’s model describes Hox genes as expressing or non-expressing, but researchers now understand that these states are more accurately described as, respectively, expressible or non-expressible. Recent studies in both mice ^[9][10] and Drosophila ^[11] show that the expressible Hox genes in a given cell (Figure 2A-right) are in an open chromatin state, characterized by Trithorax (Trx) protein binding, while the non-expressible genes are in a closed chromatin state, characterized by Polycomb (Pc) protein binding (Figure 3). These chromatin states are usually heritable from one cell generation to the next, thereby ensuring that Hox expressibility patterns acquired in the early embryo are faithfully remembered at all later stages, enabling guidance throughout the course of development. At any region along the body, there is typically only one boundary between expressible and non-expressible genes in both mice ^[9][10] (Figure 2A-right and Figure 3) and Drosophila ^[11]. In a subtle difference from Lewis’s original proposal, expressible genes need not always be expressed. Expression depends upon the availability of activating or repressive transcription factors and may change in a tissue over developmental time. Expressible Hox genes have been described as “open for business” ^[12][13]. Regions of the body where Hox genes are non-expressible (closed for business; red zones in Figure 3) typically cannot express these genes at any time.
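
The single-boundary property described above can be illustrated with a small Python sketch. It is a toy model, not taken from the cited studies: the five-gene cluster, the integer axial positions and the rule that a gene is expressible at and behind its anterior limit are all simplifying assumptions introduced here for illustration. The sketch counts, at each axial position, how many transitions between expressible and non-expressible genes occur along the cluster.

```python
from typing import Dict, List

# Hypothetical five-gene cluster, listed in 3' to 5' chromosomal order.
CLUSTER: List[str] = ["Hox1", "Hox2", "Hox3", "Hox4", "Hox5"]


def expressible_states(anterior_limits: Dict[str, int], position: int) -> List[bool]:
    """Expressibility of each gene at one axial position (0 = most anterior).

    Assumption: a gene is expressible at, and posterior to, its anterior limit.
    """
    return [position >= anterior_limits[gene] for gene in CLUSTER]


def count_boundaries(states: List[bool]) -> int:
    """Count open/closed transitions between neighbouring genes in the cluster."""
    return sum(1 for left, right in zip(states, states[1:]) if left != right)


def max_boundaries(anterior_limits: Dict[str, int], n_positions: int) -> int:
    """Largest boundary count found at any axial position."""
    return max(
        count_boundaries(expressible_states(anterior_limits, p))
        for p in range(n_positions)
    )


# Anterior limits follow gene order (collinear).
collinear = {"Hox1": 0, "Hox2": 1, "Hox3": 2, "Hox4": 3, "Hox5": 4}
# Limits of Hox3 and Hox5 exchanged, so gene order and limit order disagree.
non_collinear = {"Hox1": 0, "Hox2": 1, "Hox3": 4, "Hox4": 3, "Hox5": 2}

print(max_boundaries(collinear, 5))      # 1
print(max_boundaries(non_collinear, 5))  # 2
```

In this toy setting the collinear arrangement never produces more than one boundary at any position, whereas exchanging the limits of two genes produces positions with two boundaries.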

Figure 3. Discrete domains of open and closed chromatin generally support Lewis’s model. Antibody studies show correspondence between the position of cells along the head–tail axis and the distributions of Hox genes between open (Hox expressible, green domain; Trithorax-rich) and closed (Hox non-expressible, red domain; Polycomb-rich) chromatin states. At each level along the body, there is only a single boundary between these states, supporting Lewis’s model (Figure 2A-right). Re-drawn/modified from Noordermeer et al. ^[10].

2. Seeking Sense in the Hox Gene Cluster

2.1. Seeking Sense in the Evolutionary Origin of Hox Clustering and Transcriptional Direction

Clustering arose due to gene duplication from an original proto-Hox gene. Duplication occurs due to unequal cross-over of DNA during meiosis in germ cells. This may be due to a cross-over between repeat elements in the DNA, or between other regions of similar sequence. It results in one chromosome losing a genetic fragment (such as a Hox gene in our case) while its attached homologue incorporates the extra fragment. This process, called “tandem gene duplication”, leaves one chromosome with both the maternal- and paternal-derived copies of the Hox gene lying in tandem on the same DNA strand, and in the same orientation ^[14][15]. If this expanded chromosome contributes to a zygote, then the additional Hox copy offers new scope for evolutionary change. For example, if it acquires a new anterior boundary of expression and function, due to a change in regulation or mutation, then this can specify a new zone of development along the head–tail axis. It is usually suggested that this re-purposing of the new Hox gene occurs after the duplication event ^[15]. However, it has been proposed that two identical Hox genes may provide a selective disadvantage and a more reasonable hypothesis might be that the genes were initially alleles of the same gene that already possessed some useful differences in expression and/or function ^[16]. Repeated cycles of tandem gene duplication then led to the growing Hox gene cluster, permitting ever-increasing complexity of body structures along the head–tail axis. The process neatly explains not only why the Hox genes were formed in clusters, but also why they transcribe in the same direction. The cluster developed with spatial collinearity, and its polarity (that is, whether anterior-expressed genes lie upstream or downstream of posterior genes) was likely established with the initial duplication/re-purposing event ^[17].

Hox genes are members of the ANTP class of homeobox-containing genes. This class also includes the developmental gene groups paraHox, Dlx and NK, and all are thought to have arisen by tandem duplication from the same ANTP class proto-Hox gene ^[18]. Although paraHox, Dlx and NK genes are now usually dispersed from the Hox cluster, some of these remain linked to the Hox cluster in at least some protostome and deuterostome species ^[18][19][20][21]. Apart from the ANTP class genes, there are ten other classes of homeobox genes ^[18], such as PRD, POU and LIM, and all classes likely arose by tandem gene duplication from an original proto-homeobox gene.

The question then arises as to when these events took place. Figure 4 indicates the origin of homeobox genes within the tree of life. There are no ANTP class genes outside the metazoa ^[18]. However, duplication and diversification of ANTP class genes must have occurred very early in pre-bilaterian evolution because at least some poriferans (sponges) possess both NK ^[18] and paraHox genes ^[22] even though they lack Hox genes, which are presumed to have been secondarily lost ^[22]. Cnidarians possess Hox, paraHox and NK genes ^[18], but there is debate over whether cnidarian Hox genes are strict orthologues of bilaterian Hox genes ^[23], and whether they show spatial collinearity ^[5] (Section 2.9). Proceeding backwards in time, the greatest proliferation of homeobox genes occurred with the advent of multicellularity in animals (metazoa), and also in plants and fungi (Figure 4) ^[24]. Bacteria do not have homeobox genes. However, they do have helix-loop-helix genes which have similarities in structure and function, and which may have been the ancestral progenitors of homeobox genes ^[24].
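
The tandem duplication mechanism described at the start of this section can be made concrete with a minimal Python sketch. It treats each chromosome as an ordered list of labelled elements and models unequal cross-over as an exchange of tails at misaligned breakpoints. The element names ("repeat", "Hox(m)", "Hox(p)", "geneX") and the choice of breakpoints are hypothetical and chosen only for illustration; real cross-over occurs within aligned sequence rather than between whole list elements.

```python
from typing import List, Tuple

Chromosome = List[str]


def unequal_crossover(
    maternal: Chromosome, paternal: Chromosome, break_m: int, break_p: int
) -> Tuple[Chromosome, Chromosome]:
    """Exchange chromosome tails at the given breakpoints.

    When break_m != break_p the homologues are misaligned, so one product
    gains the elements that the other product loses.
    """
    product_a = maternal[:break_m] + paternal[break_p:]
    product_b = paternal[:break_p] + maternal[break_m:]
    return product_a, product_b


# Hypothetical homologues: one Hox gene flanked by two repeat elements.
maternal = ["repeat", "Hox(m)", "repeat", "geneX"]
paternal = ["repeat", "Hox(p)", "repeat", "geneX"]

# Misalignment: the maternal break lies after its second repeat,
# the paternal break lies after its first repeat.
duplicated, deleted = unequal_crossover(maternal, paternal, break_m=3, break_p=1)

print(duplicated)  # ['repeat', 'Hox(m)', 'repeat', 'Hox(p)', 'repeat', 'geneX']
print(deleted)     # ['repeat', 'geneX']
```

One product carries the maternal and paternal copies in tandem and in the same order of elements, while the other has lost the gene, matching the outcome described in the text.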

Figure 4. Origins of homeobox genes within the tree of life. Branching orders of Porifera relative to Cnidaria, and Xenacoelomorpha relative to protostomes and deuterostomes, are uncertain. LUCA, last universal common ancestor; Ch-MLCA, choanoflagellate-metazoan last common ancestor; Cn-BLCA, cnidarian–bilaterian last common ancestor; P-DLCA, protostome–deuterostome last common ancestor; MYA, million years ago. Figure from Gaunt 2019 [6].

Overall, the distribution of homeobox and Hox genes throughout the tree of life makes sense in terms of the progressive development over time of the Hox gene cluster.

2.2. Seeking Sense in the Maintenance and Compactness of Clusters

Clusters are not always maintained. Some species may develop splits in the cluster, e.g., Drosophila where splits occur at different positions in separate sub-species ^[25]. However, the large regulatory regions between insect Hox genes probably limit the number of places where clusters can be broken without affecting gene function ^[26]. Other species, such as the urochordate Oikopleura, show complete dispersal of the Hox cluster throughout the genome though, remarkably, the genes continue to show spatial collinearity in expression relative to the position that they occupied in the ancestral cluster ^[27]. Most species, however, continue to maintain the clustering of their Hox genes, at least to some extent. One explanation for this is enhancer sharing between Hox genes. For example, the iab-5 regulatory region in Drosophila apparently regulates both abd-A and Abd-B ^[28]. Similarly, in mice, the CR3 enhancer regulates both Hoxb4 and Hoxb3 to become expressed up to the level of rhombomere 6/7 ^[29], and a separate Kreisler element then also drives Hoxb3 expression in rhombomere 5 ^[30]. These are examples of local enhancer sharing, where the enhancer lies within the Hox gene cluster. In vertebrates, there are also long-range enhancers, usually located outside the Hox gene cluster, which can loop in to regulate multiple Hox genes in a tissue-specific way ^[31][32]. Long-range enhancers, seen in vertebrates, encourage compaction and exclusion of other, non-Hox, genes. Compact clusters of vertebrates (Figure 5) are likely a derived rather than ancestral condition ^[32].

Figure 5. Hox clusters are more compact in vertebrates than in non-vertebrates. Clusters shown are all apparently intact. Sizes shown are from: Ixodes ^[33]; Parhyale ^[34]; Apis ^[35]; Anopheles ^[33]; Nasonia ^[33]; Tribolium ^[36]; Saccoglossus ^[37]; sea urchin ^[38]; amphioxus ^[32]; mouse ^[33].

Other proposed reasons to maintain clustering are provided by the chromatin opening model and the Hox conjecture, both discussed below, and by the supposition that transcriptional activation and repression may each work best on nearby genes, as discussed in Section 2.3. Species that have lost their clustering have presumably overcome the above requirements. Transcriptional direction within Hox clusters is usually maintained, but the Dfd gene in Drosophila provides an example of gene inversion from the ancestral state (Figure 1). Inversion is normally selected against because the promoter of the inverted gene may fall under the regulation of neighbour gene enhancers, with serious consequences for development. For example, the well-known dominant Drosophila Antennapedia homeotic mutation in which legs develop instead of antennae is usually due to inversions in the Antp gene, causing it to become ectopically expressed by the enhancers of upstream, more anteriorly-expressed genes ^[39][40]. Darbellay et al. ^[41] inverted mouse Hoxd11 and Hoxd12, including displacement of a CTCF insulator element that normally separates them from Hoxd13. This caused an anterior shift in expression from the Hoxd13 locus, possibly due to its misregulation from a Hoxd11/d12, or long-range, enhancer. These authors also describe a Hoxd11 inversion that robustly suppressed expression from its Hoxd12 neighbour, and they propose that this may be due to the collision of transcription units on opposing DNA strands.

2.3. Seeking Sense in Spatial and Temporal Collinearities

Spatial collinearity describes the correspondence between the order of genes along the cluster (3′ to 5′) and the order of their expression domains (anterior-to-posterior) along the developing embryo. Temporal collinearity describes a correspondence between the order of genes along the cluster (3′ to 5′) and the time of their first expression (early to late) in the embryo. In addition to spatial collinearity in Hox expression, found to some extent in most bilaterians ^[5][6], many species including vertebrates ^[42], cephalochordates ^[43], some annelids ^[44] and some arthropods ^[45] display temporal collinearity. Many embryo types, including vertebrates, grow by successive addition of new parts at their posterior ends. That is, they develop in a head-to-tail temporal sequence from a posterior growth zone, and each new zone moves its overall Hox expression one step down in the pattern shown in Figure 2A-right.

The “chromatin opening” model was developed principally for embryos that develop from a posterior growth zone. It proposes that genes are expressed by progressive, timed, 3′ to 5′ opening of Hox cluster chromatin structure that permits and regulates the expression of Hox genes in their anterior-to-posterior temporal sequence (temporal collinearity) [46]. This means that temporal collinearity specifies the need for, and is dependent upon, spatial collinearity. If this was so ancestrally, then it can likely be predicted that the ancestral bilaterian already had temporal collinearity [47] and developed its body axis from a posterior growth zone [48]. While it is clear that there is indeed a progressive opening of the Hox cluster as gastrulation proceeds [9], it is not obvious whether this is a cause or a consequence of Hox gene activation. Some experimental evidence against the chromatin opening model has been reviewed earlier [8]. More recent work in mouse embryonic stem cell aggregates [49] indicates that CTCF-binding insulator elements are successively breached along the Hoxd cluster, with accompanying, progressive chromatin extrusion and change in Hox gene expression. This process in itself suggests a need for some level of spatial collinearity and would be in keeping with the chromatin opening model. However, the authors note that spatial and temporal collinearities persist even after deletion of these insulator elements, and they conclude that insulators regulate the pace and precision of collinearities rather than their organization ^[49][50].

Duboule’s Hox Conjecture ^[51] proposes that for reasons yet unclear, but not necessarily dependent upon chromatin opening, spatial collinearity may still be necessary to achieve temporal collinearity in Hox expression. Evidence for this is largely circumstantial: species displaying temporal collinearity have so far been found to have an intact, unbroken Hox gene cluster. Studies on the annelid Urechis unicinctus perhaps provide some evidence against this [52]. Its posterior Hox genes (Lox2 and Post2) are separated from more anterior genes (Hox1 to Lox4) by more than 1 Mb, as sub-clusters, yet the genes as a whole display spatial and temporal collinearities [52]. Conversely, however, an intact cluster need not necessarily imply temporal collinearity. For example, the hemichordate Saccoglossus ^[37] has an intact cluster, but apparently does not show temporal collinearity ^[53].

An alternative “gene segregation” model [8] was proposed to account for why spatial collinearity is largely maintained even in species not showing either temporal collinearity or a posterior growth zone, including perhaps the ancestral bilaterian. It could also be called the “single boundary” model. In this model, spatial collinearity is the primary feature and temporal collinearity follows as a necessary consequence in animals that develop from a posterior growth zone. This is so in these animals because anterior parts form first due to the earliest expression of anterior Hox genes, which are 3′-located. Posterior parts form last due to the late expression of posterior Hox genes, which are 5′-located. Overall, spatial collinearity thereby gives rise to temporal collinearity. The model does not dispute the importance of timing in Hox gene activation, especially so in species that develop from a posterior growth zone, but it does suggest that this may not depend primarily upon spatial collinearity, as it does in the chromatin opening model. Comparison of Figure 2A with Figure 2B shows how spatial collinearity (Figure 2A) provides the minimum number (one) of boundaries between expressible and non-expressible Hox genes within the cluster at all positions along the body (Figure 3).

The single boundary model proposes that the bilaterian cluster was established and maintained with spatial collinearity because this arrangement allows a number of potential benefits (see [8] and references therein), briefly summarized as follows. (i) Minimal boundaries mean the minimum risk that gene repressive factors, e.g., Pc protein, will inadvertently spread from inactive to active genes; and also a minimum risk that gene activating factors, e.g., Trx protein, will spread from active to inactive genes. (ii) All active genes can more readily be grouped together in a nuclear transcription factory where there may be higher local concentrations of transcription and other factors necessary for gene expression. Additionally, all non-expressible genes can be grouped together in a nuclear Pc body where there may be higher local concentrations of repressive factors. (iii) Segregated active and inactive genes are less likely to interfere sterically with each other in their movement to transcription factories and Pc bodies. (iv) Contiguity of repressed Hox genes allows potential cross-spreading between them of repressive components.

Hox gene collinearity, well known to be ubiquitous in bilaterians, therefore remains an enigmatic feature of Hox gene expression. Two models which attempt to make sense of it are reviewed here. The ancestral bilaterian is likely to have had a non-compact Hox cluster, with vertebrate clusters having since become compacted ^[32]. Vertebrate Hox genes, perhaps becoming mutually more entangled in their regulation, might therefore not provide the ideal system to make sense of the ancestral state. There remains a need to investigate collinearity in a wide range of species.
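
The claim of the gene segregation model that temporal collinearity follows from spatial collinearity in animals with a posterior growth zone can be illustrated with a short Python sketch. This is a toy simulation under assumptions introduced here, not a model from the cited work: one axial position is added per time step, and a gene is first expressed when the position at its anterior limit forms. The gene names and limits are hypothetical.

```python
from typing import Dict, List


def activation_order(
    cluster: List[str], anterior_limits: Dict[str, int], n_segments: int
) -> List[str]:
    """Order in which genes are first expressed during head-to-tail growth.

    Assumption: the segment formed at time step t lies at axial position t,
    and a gene switches on when growth first reaches its anterior limit.
    """
    first_seen: Dict[str, int] = {}
    for step in range(n_segments):
        position = step  # newest segment is added at the posterior end
        for gene in cluster:
            if gene not in first_seen and position >= anterior_limits[gene]:
                first_seen[gene] = step
    return sorted(first_seen, key=lambda gene: first_seen[gene])


def is_temporally_collinear(
    cluster: List[str], anterior_limits: Dict[str, int], n_segments: int
) -> bool:
    """True when activation order equals the 3' to 5' order of the cluster."""
    return activation_order(cluster, anterior_limits, n_segments) == cluster


# Hypothetical cluster in 3' to 5' order.
cluster = ["Hox1", "Hox2", "Hox3", "Hox4", "Hox5"]
spatially_collinear = {"Hox1": 0, "Hox2": 1, "Hox3": 2, "Hox4": 3, "Hox5": 4}
# Limits of Hox3 and Hox5 exchanged, breaking spatial collinearity.
not_collinear = {"Hox1": 0, "Hox2": 1, "Hox3": 4, "Hox4": 3, "Hox5": 2}

print(is_temporally_collinear(cluster, spatially_collinear, 5))  # True
print(activation_order(cluster, not_collinear, 5))
# ['Hox1', 'Hox2', 'Hox5', 'Hox4', 'Hox3']
```

Under these assumptions, a spatially collinear cluster is activated in 3′ to 5′ order simply because anterior positions form first, with no separate timing mechanism along the cluster.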