Nucleosome formation is similar from humans to yeast, and even to
Archaea, as the basic structures of the folded histones are shared in various species. In the nucleosome, the central ‘disc’ architecture is composed of four core histones, H2A, H2B, H3, and H4, sharing well-conserved motifs termed histone-fold domains comprising three α-helices, α1, α2, and α3, separated by two loops, L1 and L2
[22][23] (
Figure 1A). The H2A–H2B and H3–H4 heterodimers are assembled into the characteristic handshake motif via central hydrophobic interactions, along with electrostatic contacts and hydrogen bonds
[8] (
Figure 1B,C). Two H3–H4 dimers are joined into a (H3–H4)₂ tetramer at physiological ionic strength through interactions between the C-terminal halves of the α2 helices of H3 and H3′ (the adjacent copy of H3) and the α3 helix of H3. A single (H3–H4)₂ tetramer and two H2A–H2B dimers are incorporated into the nucleosome core particle to form the histone octamer, around which the double-stranded DNA is wrapped (
Figure 1D). Within the octamer, each H2A–H2B dimer is docked onto the tetramer through the interaction of H2B with H4, in addition to the H3–H3′ interaction that holds the tetramer together. These associations are described as the H2B–H4 four-helix bundle and the H3–H3′ four-helix bundle, respectively
[8].
In the nucleosome, 145–147 base pairs of DNA are wrapped around the histone octamer in 1.65 left-handed superhelical turns with 14 contact sites on the histone surface, where the DNA phosphate backbone faces the histone octamer. Twelve of the 14 DNA–histone contact sites are mediated by the handshake motifs, which are divided into two types: the α1α1 and L1L2 regions that are enriched in positively charged residues (
Figure 1B,C). The α1α1 region is formed by the N-terminal ends of the α1 helices of the histone heterodimer pair (
Figure 1B,C). The L1L2 region is formed by the L1 and L2 loops and the C-termini of the α2 helices on each end of the heterodimer pair (
Figure 1B,C). A total of four α1α1 and eight L1L2 regions interact directly with 12 helical turns (120 bp) of the DNA in the minor grooves. The two remaining DNA–histone contact sites originate from the N-terminal α helix (αN) and the N-terminal extension of H3, where the positively charged residues bind to the DNA at the entry–exit sites of the nucleosome (
Figure 1D). In connection with studies of histone variants, mutational analyses have shown that the arginine residues at positions 42
[24], 49, and 53
[25], located on the αN helix and N-terminal extension of H3, contribute to the DNA flexibility at the entry–exit sites. Additionally, the interaction of the C-terminal region of H2A with the H3–H4 tetramer guides this last turn of the nucleosomal DNA (
Figure 1D). The histone variant H2A.B (formerly H2A.Bbd), which lacks the C-terminal region, forms a nucleosome with ~110 base pairs of DNA wrapped around the octamer
[26][27]. The flexibility of the DNA at the entry–exit sites also plays a considerable role in the higher-order structure of chromatin.
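The contact-site bookkeeping in this paragraph can be checked with a few lines of arithmetic. The following Python sketch is only an illustrative tally: the constant and function names are hypothetical, the value of 10 bp per helical turn is an assumption taken from the "12 helical turns (120 bp)" figure above, and the default of 145 bp is the lower bound of the 145–147 bp range.

```python
# Illustrative tally of DNA-histone contact sites, using only counts quoted in the text.
# All names are hypothetical; BP_PER_HELICAL_TURN is an assumption (~10 bp per turn).

HANDSHAKE_SITES = {"alpha1-alpha1": 4, "L1L2": 8}  # sites mediated by handshake motifs
H3_ALPHA_N_SITES = 2                               # H3 alphaN / N-terminal extension sites
BP_PER_HELICAL_TURN = 10                           # assumed from "12 helical turns (120 bp)"


def contact_summary(total_bp: int = 145) -> dict:
    """Summarize contact sites and the DNA length they account for."""
    handshake = sum(HANDSHAKE_SITES.values())
    handshake_bp = handshake * BP_PER_HELICAL_TURN
    return {
        "handshake_sites": handshake,
        "total_sites": handshake + H3_ALPHA_N_SITES,
        "handshake_bp": handshake_bp,
        # DNA not covered by handshake motifs, i.e. the entry-exit regions
        "remaining_bp": total_bp - handshake_bp,
    }


summary = contact_summary()
print(summary)
```

Under these assumptions, the DNA not covered by the handshake motifs is the portion near the entry–exit sites, where the H3 αN helix and the H2A C-terminal region act.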
In many eukaryotic nucleosomes, the acidic patch is a fundamental and well-conserved region located on the disc surface. The acidic side chains of amino acid residues in the α2 helix of H2A (Glu56, Glu61, Glu64, Asp90, Glu91, and Glu92 in human) and the αC helix of H2B (Glu105 and Glu113 in human) form a negatively charged patch exposed to the solvent (
Figure 1E). The acidic patch acts as a binding platform for many proteins that regulate chromatin function, including histone modification enzymes, chromatin remodelers, and DNA methyltransferases, which typically engage it through arginine anchors
[6][28]. In addition, X-ray crystallographic analyses have shown that the basic residues in the N-terminal tail of H4 contact the acidic patch, in a manner reminiscent of the nucleosome–nucleosome interactions in nucleosome arrays, suggesting that the acidic patch may also be involved in chromatin compaction
[28][29].
Histone proteins contain intrinsically disordered regions at the N-termini of H2A, H2B, H3, and H4, and the C-terminus of H2A (
Figure 1A,E). These flexible ‘tails’, which protrude from the central disc architecture of the nucleosome core, are partly exposed to the solvent. With their many positively charged residues, such as lysine and arginine, the histone tails associate with the negatively charged DNA in the nucleosome core particle or linker regions
[30][31]. However, the histone tails do not seem to affect the disc architecture of the nucleosome, as suggested by the X-ray structures of nucleosomes containing N-terminally truncated histones
[32]. In contrast, the histone tails likely alter the higher-order structure of chromatin through the interactions between the H4 tail and the acidic patch
[33]. Biophysical evidence has indicated the contribution of histone tails to the compaction of nucleosome arrays
[34]. Importantly, the histone tails are often enzymatically modified in vivo, by the acetylation, methylation, and ubiquitylation of lysine residues, and the phosphorylation of serine and threonine residues
[35]. These post-translational modifications serve as binding sites for factors that regulate the chromatin structure and function
[5][36][37]. Although the histone tail sequences are relatively diverse among eukaryotic species, the residues subjected to post-translational modifications are highly conserved. In particular, the H3 Lys4, Lys9, Lys27, and Lys36 residues and the H4 Lys20 residue, which are targeted for methylation and acetylation, are inferred to have been present already in the last eukaryotic common ancestor (LECA)
[38].
3. Structures of Nucleosomes Containing Parasite Histones
3.1. Genomic Organization and Gene Regulation in Giardia
Giardia lamblia is a flagellated unicellular eukaryote that parasitizes the small intestines of humans and animals.
Giardia causes a diarrheal disease termed giardiasis, which is one of the most common parasitic infections throughout the world
[39][40]. The
Giardia life cycle consists of two main forms: the non-infectious trophozoite that causes the primary symptoms of diarrhea by proliferating on the surfaces of the host intestinal cells, and the highly infective cyst form, which is environmentally resistant and can survive outside the host. This parasite is classified in the order
Diplomonadida within the
Excavata supergroup. In the eukaryotic tree of life,
Giardia occupies a deep-branching position, shows a simplification of most cellular processes, and lacks certain organelles, such as mitochondria and peroxisomes
[41].
Giardia has two nuclei with equivalent activities. As in higher eukaryotes, genome regulation takes place on a “beads-on-a-string” chromatin structure
[42]. The 12 mega base pairs of the
Giardia genome contain 4963 open reading frames with few introns, short intergenic regions, short 5′ and 3′ untranslated regions, and small promoter regions that are close to transcriptional start sites
[43][44][45].
Giardia has a simple transcription initiation complex
[43], with few transcription factors
[46] and chromatin remodeler subunits
[47]. A genome-wide transcriptional analysis indicated that bidirectional transcription produces both sense and antisense transcripts in
Giardia [48]. In addition,
Giardia possesses an RNA interference system, which is involved in antigenic switching, conferring a pathogenic ability to evade the host immune system
[49]. These results suggest that transcriptional control is limited in
Giardia. During encystation, the transformation from the trophozoite to the cyst stage, histone modifications such as ubiquitylation, deacetylation, and methylation have been confirmed, indicating that chromatin-based regulation of gene expression is essential for biological processes in
Giardia [50][51][52][53][54].
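The compactness of the genome described above can be put into rough numbers. The Python sketch below simply divides the two figures quoted in the text (about 12 mega base pairs and 4963 open reading frames); the function names are hypothetical, and the result is an average spacing per open reading frame, not a measured gene or intergenic length.

```python
# Back-of-the-envelope gene density from the two figures quoted in the text.
# Function names are hypothetical; the genome size is the approximate value given above.

GENOME_SIZE_BP = 12_000_000  # "12 mega base pairs" (approximate)
ORF_COUNT = 4963             # open reading frames reported for the Giardia genome


def mean_bp_per_orf(genome_bp: int, orf_count: int) -> float:
    """Average genomic span available per ORF (coding plus flanking sequence)."""
    return genome_bp / orf_count


def orfs_per_megabase(genome_bp: int, orf_count: int) -> float:
    """Number of ORFs per million base pairs."""
    return orf_count / (genome_bp / 1_000_000)


spacing = mean_bp_per_orf(GENOME_SIZE_BP, ORF_COUNT)
density = orfs_per_megabase(GENOME_SIZE_BP, ORF_COUNT)
print(f"{spacing:.0f} bp per ORF, {density:.1f} ORFs per Mb")
```

An average of roughly 2.4 kb per open reading frame, coding sequence included, is consistent with the short intergenic and untranslated regions noted above.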
3.2. Structure of the Nucleosome Containing Giardia Histones
The
Giardia genome encodes two copies each of the H2A, H2B, and H3 genes and three copies of the H4 gene for canonical histones, and the corresponding mRNAs are polyadenylated
[44][55]. Histone variants H3B and CenH3 localized in the centromere were identified, but not the linker histone H1
[56]. In higher eukaryotes, phosphorylation of the H2A variant H2A.X at the Ser139 residue (γH2A.X) facilitates DNA repair in response to double-strand breaks
[15][36][57]. Although no H2A variant has been identified in
Giardia, the “Ser-Gln-Asp-Leu” motif, which is found in the H2A.X variant of higher eukaryotes, is present at the C-terminus of canonical H2A. The four core histones share the typical eukaryotic features of histone-fold domains, containing three α helices separated by two loops
[55]. Parasite histones generally have diverse amino acid sequences, with remarkably low sequence identity compared to metazoan histones
[58][59]. Indeed, the histone-fold domains of
Giardia lamblia (
G. lamblia) H2A, H2B, H3, and H4 share 48, 49, 60, and 78% sequence identity, respectively, with the corresponding human histone-fold domains. As in other eukaryotes, the
Giardia core histones contain intrinsically disordered “tail” regions that undergo conserved post-translational modifications. Emery-Corbin and colleagues identified
Giardia histone modifications, including methylation, acetylation, and phosphorylation, using mass spectrometry
[60]. They demonstrated that
Giardia histone tails are highly modified (more than 50 sites were identified) and that these histone modifications in
Giardia are largely equivalent to those in many other eukaryotes, suggesting that similar epigenetic mechanisms exist in this parasite
[60]. Among the well-conserved modifications, the phosphorylation of H3 Ser10 and the methylation of H3 Lys4, Lys9, Lys27, and Lys36, but not of H3 Lys79 or H4 Lys20, were identified in
Giardia.
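Sequence identities such as those quoted above for the histone-fold domains are usually computed from a pairwise alignment. The Python sketch below shows one common definition, identical residues divided by the number of ungapped aligned columns. It is only an illustration: the function name and the toy sequences are hypothetical (they are not histone sequences), and the source does not state which alignment method or denominator was used for the reported values.

```python
# Minimal percent-identity calculation over two pre-aligned sequences.
# Assumption: identity = matches / ungapped aligned columns; other conventions
# (e.g. dividing by the shorter sequence length) give different numbers.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned sequences for illustration only.
toy_a = "ACDEF-GHIK"
toy_b = "ACDQFWGHLK"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

Because the result depends on both the alignment and the choice of denominator, values from different studies are comparable only when computed in the same way.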
The cryo-EM structure of the
G. lamblia nucleosome was determined at 3.6 Å resolution, after reconstitution with a 145-base-pair palindromic DNA derived from the Widom 601 sequence (601L)
[61]. The overall structure of the
G. lamblia nucleosome resembles that of the human nucleosome (
Figure 2A). All
G. lamblia histones fold into the characteristic handshake motifs, but with notable deviations from the main chains of human histones in the nucleosome, including the
G. lamblia-specific insertions of six and two residues in H2B and H3, respectively (
Figure 2B,C). This six-residue insertion extends the α1 helix and L1 loop of
G. lamblia H2B and alters the conformation of the L1L2 region (
Figure 2B). Importantly, the insertion affects the shape and peptide-binding properties of the adjacent acidic patch (see below). In the cryo-EM structure of the
G. lamblia nucleosome, only 125 base pairs of DNA are wrapped asymmetrically around the histones, and the DNA flexibility was confirmed in solution (
Figure 2A). Furthermore, in the cryo-EM structure of the
G. lamblia nucleosome, only one of the shortened αN helices of H3 was resolved, and the two C-terminal regions of H2A were not visualized (
Figure 2D,E). In the human canonical nucleosome, the αN helix and N-terminal extension of H3 bind directly to the last turn of the superhelical DNA, and these interactions are considered to be guided by the C-terminal region of H2A via a hydrophobic cluster. In
G. lamblia, the hydrophobic residues of the H2A C-terminal region are replaced by hydrophilic or small aliphatic residues. A mutational analysis revealed that the C-terminal region of
G. lamblia H2A is responsible for the flexibility of the DNA entry–exit sites in the nucleosome. Consistent with this DNA flexibility, the nucleosome array containing
G. lamblia histones appears to adopt a more relaxed conformation than the human nucleosome array. In addition, a biochemical analysis revealed the instability of the
G. lamblia nucleosome. These properties of relaxed chromatin and nucleosome instability could facilitate transcriptional activation in
Giardia, which possesses simple gene regulatory systems.
Figure 2. Nucleosome structure containing G. lamblia core histones. (A) Overall structure of the G. lamblia nucleosome (PDB ID: 7D69). (B,C) Handshake motifs of H2A–H2B (B) and H3–H4 (C) dimers in the G. lamblia nucleosome are superimposed with H2A–H2B and H3–H4 dimers in the human nucleosome (PDB ID: 6R93), respectively. (D) Entry–exit DNA regions of the G. lamblia nucleosome and the corresponding regions in the human nucleosome. Dashed lines represent predicted DNA. (E) Close-up views of the DNA and histones near the entry–exit DNA regions of G. lamblia and human nucleosomes with the H3 αN and α2 helices and L2 regions. The hydrophobic residues of the H2A C-terminal region form a hydrophobic core with the hydrophobic residues of H3, which is only visible in the human nucleosome. (F) Electrostatic potential surfaces and close-up views of the acidic patch of the G. lamblia and human nucleosomes are shown.