The small nucleolar RNA snR30 (U17 in humans) plays a unique role during ribosome synthesis. Unlike most members of the H/ACA class of guide RNAs, the small nucleolar ribonucleoprotein (snoRNP) complex assembled on snR30 does not direct pseudouridylation of ribosomal RNA (rRNA); instead, snR30 is critical for 18S rRNA processing during formation of the small subunit (SSU) of the ribosome. Specifically, snR30 is essential for three pre-rRNA cleavages at the A0/01, A1/1, and A2/2a sites in yeast and humans, respectively. Accordingly, snR30 is the only essential H/ACA guide RNA in yeast. However, the molecular mechanism of snR30 and how it promotes pre-rRNA processing remain under investigation.
Eukaryotic ribosomes are synthesized in the nucleolus with the help of hundreds of assembly factors. In a highly orchestrated fashion, multiple protein and RNA assembly factors are recruited to nascent pre-ribosomal RNA (pre-rRNA) and depart at defined times, whereas ribosomal proteins also associate in a defined order, eventually forming ribosomal subunits together with the matured rRNAs. Among the ribosome assembly factors, the nucleolar snR30 is an essential H/ACA RNA that is critical for the formation of the small ribosomal subunit, but only limited information is available on the mechanistic role of snR30 during the complex process of ribosome formation.
Ribosome biogenesis occurs co-transcriptionally on the pre-rRNAs, and snR30 is also thought to act co-transcriptionally during the processing of 18S rRNA. Three of the four mature rRNAs (18S, 5.8S, and 25S rRNA) are transcribed by RNA polymerase I as a single 35S precursor rRNA (pre-rRNA, 47S pre-rRNA in humans), whereas the fourth rRNA, 5S rRNA, is transcribed independently by RNA polymerase III. In addition to the mature rRNA sequences, the 35S pre-rRNA contains two external and two internal transcribed spacers, abbreviated as ETS and ITS, respectively, which are removed through several nucleolytic cleavage events such that these regions are not present in the mature ribosome. The initial sites of pre-rRNA cleavage, which separate the 18S rRNA from the remaining pre-rRNA, are called A sites, specifically A0, A1, A2, and A3 in yeast and 01, 1, and 2 in humans (Figure 1). Notably, snR30 is involved in pre-rRNA cleavage at sites A0, A1, and A2.
Figure 1. Overview of rRNA processing in yeast. The mature 18S, 5.8S, and 25S rRNAs are generated from a 35S pre-rRNA precursor through several nucleolytic cleavage events, shown here generating characteristic pre-rRNA intermediates. For each step, the subsequent site(s) of cleavage is highlighted in red. snR30 is essential for cleavage at sites A0, A1, and A2, which are highlighted in blue.
During ribosome biogenesis, snR30 does not act alone, but rather forms a stable ribonucleoprotein (RNP) that belongs to the family of H/ACA small nucleolar RNPs (snoRNPs). The snoRNPs represent a large portion of the ribosome biogenesis factors found in the nucleolus, and most snoRNPs chemically modify rRNA, but notably, this is not the case for the snR30 snoRNP. Occurring in the form of either C/D box or H/ACA box guide RNA systems, snoRNPs direct site-specific 2′-O-methylation and pseudouridylation, respectively. Each of these complexes is composed of one guide RNA, also called small nucleolar RNA (snoRNA), and four core proteins. The proteins Nop1 (Fibrillarin in vertebrates), Nop56, Nop58, and Snu13 (15.5K) assemble on C/D box guide RNAs, forming C/D snoRNPs, whereas the proteins Cbf5 (Dyskerin), Nop10, Gar1, and Nhp2 are components of the H/ACA box snoRNP together with H/ACA box guide RNA. During ribosome assembly, the H/ACA snoRNPs introduce many pseudouridines in rRNA which play an important role in the maintenance of ribosome structure, stability, and translational fidelity.
Interestingly, of the more than one hundred snoRNAs known in yeast, only three guide RNAs are essential, namely U3, U14, and snR30. In addition, deletion of S. cerevisiae snR10 results in slow growth and a cold-sensitive phenotype. Remarkably, the primary function of these essential snoRNAs is not rRNA modification; rather, they are all directly or indirectly required for rRNA processing during ribosome synthesis. Additionally, snR30 may also play a role in cholesterol trafficking in higher eukaryotes. In comparison to snR30, much more is known about the essential U3 snoRNA, a C/D box RNA that is responsible for the correct folding of the central pseudoknot in the 18S rRNA. The central pseudoknot is an SSU rRNA structure conserved from prokaryotes to higher eukaryotes that connects the different domains of the SSU rRNA. To facilitate pseudoknot formation, U3 forms multiple essential base-pairing interactions with the pre-rRNA in both the 5′ ETS and the 18S rRNA. In particular, the box A and A′ motifs in the U3 snoRNA bind to the 18S rRNA regions which form the central pseudoknot. When U3 is depleted, biogenesis of the small subunit is halted early because the U3 snoRNP has a central role in this process. This leads to accumulation of 23S pre-rRNA due to an inability to process at A0, A1, or A2. Lastly, U14 is another C/D RNA that is essential for 18S rRNA formation. U14 appears to have two distinct roles: one region, which base-pairs to the pre-rRNA and is not essential, guides the introduction of a 2′-O-methylation at C414 in 18S rRNA, whereas another region is essential and binds extensively to the 5′ domain of the 18S rRNA. Depletion of U14’s essential region leads to the same phenotype as depletion of U3 or snR30. In this respect, U14 resembles the H/ACA box RNA snR10, which also has dual functions by directing pseudouridine formation at position 2923 in 25S rRNA and by contributing to 18S rRNA processing. 
In cases where snR10 is lacking, the phenotype is similar to that caused by loss of the essential snoRNAs, but less pronounced. There is minor 35S pre-rRNA accumulation that leads to an increase in 23S pre-rRNA and a 21S pre-rRNA product. However, at permissive temperatures, this pre-rRNA is still processed, albeit slowly, into mature 18S rRNA.
Knowledge of ribosome biogenesis, and specifically small subunit biogenesis, has significantly increased due to recent advances in cryo-electron microscopy (cryo-EM), yielding high-resolution structures of the SSU processome complex, but no structural insight into snR30 bound to the SSU processome is available to date. The recent cryo-EM structures provide insight into the interactions within the SSU processome, which has led to advances in the understanding of both protein positioning and the timing of large-scale rearrangements in the small ribosomal subunit during ribosome biogenesis. However, many of the early-acting factors, such as the snR30 snoRNP, are not present in the structures solved so far, presumably because the snR30 snoRNP has already dissociated before formation of the stable SSU processome. Therefore, our knowledge of the interaction of snR30 with the SSU processome is derived mostly from genetic and biochemical studies. For example, by tagging and purifying a known biogenesis factor, the timing of SSU processome association of this factor can be roughly estimated by identifying other co-purified factors. More recently, different ribosomal precursors have also been studied by expressing and purifying truncated and tagged pre-rRNA variants followed by proteomic analysis, providing additional information on when assembly factors associate and dissociate during pre-rRNA transcription. Notably, taking such an approach, snR30 was detected in early ribosomal intermediates that arise before 18S rRNA transcription is complete, but it was no longer detected in later intermediates. These findings suggest that snR30 acts relatively early during ribosome biogenesis when the structure of the SSU processome might still be rather flexible and labile.
The early stages of eukaryotic ribosome synthesis are characterized by a dynamic interaction network of several factors, including snR30. Ribosome biogenesis starts with the binding of the UTP-A complex to the 5′ ETS of the 35S pre-rRNA transcript. Subsequently, the U3 snoRNP, UTP-B, and Mpp10 complexes associate. While these complexes contain a large portion of the assembly factors, there are many individual proteins, small complexes, and modification enzymes which also act during this period. snR30 is part of one of these complexes that seems to be present only until the 3′ minor domain of 18S rRNA is transcribed. The primary interaction point of snR30 is within expansion segment 6 (ES6) in the central domain, which is unstructured and therefore not resolved in the cryo-EM structures of the SSU processome. The possible function of snR30 binding for pre-rRNA folding will be further discussed below.
By recruiting all these assembly factors, the pre-rRNA in the SSU processome is held in a state where the mature subunit is recognizable, but not fully formed, as evident in the cryo-EM structures, and snR30 may contribute to preventing premature interactions as further discussed below. One of the most striking differences between the SSU processome structure and the mature 40S subunit is that the central pseudoknot is unable to form due to the presence of Sas10, Lcp5, the U3 snoRNP, and other factors. Furthermore, the SSU processome separates the four domains of the 18S rRNA into independent units which can fold separately. More recent structural information from the thermophilic fungus Chaetomium thermophilum has suggested that the 5′, central, 3′ major, and 3′ minor domains of 18S rRNA are not sequentially integrated into the pre-40S subunit, but that the 5′ domain may join the SSU processome last; however, it remains to be investigated whether this finding holds true in other organisms. Thus, the early stages of SSU formation are characterized by dynamic interactions and several conformational changes in rRNA which are facilitated by many factors, including snR30.
In summary, research into ribosome biogenesis is a rapidly evolving field, as the concerted functions of hundreds of poorly understood factors and steps, including snR30, still need to be elucidated.
In unicellular eukaryotes, such as S. cerevisiae, snR30 is transcribed as an independent gene by RNA Pol II, the mRNA-producing polymerase. In contrast to mRNA, the snoRNA undergoes processing to remove the polyA tail by initially binding the Nrd1-Nab3-Sen1 complex, which presents the snR30 3′ end to the exosome for exonucleolytic processing. At the 5′ end, snR30 possesses a trimethylguanosine cap like most other H/ACA snoRNAs in yeast, which is added by the conserved methyltransferase Tgs1.
In multicellular eukaryotes such as humans, transcription and processing of U17a and U17b, two human homologs of snR30, are significantly different. Both U17a and U17b are encoded by the U17HG gene upstream of the cell cycle regulatory gene RCC1. Interestingly, the U17HG gene harbors U17a and U17b in two introns; however, the U17HG gene seems not to encode a protein and only the U17 sequences are conserved. In other species, the location of the U17 gene varies, such as in X. laevis, where U17 is encoded in all six introns of the r-protein S8 gene. Following transcription of the host gene, the excised intron containing U17 is processed exonucleolytically at both the 5′ and 3′ ends in both Xenopus and HeLa cells, implying a conserved mode of maturation. The 3′ end is processed by the exosome, whereas the 5′ end is processed by an unknown exonuclease. Further information on snoRNA biogenesis and processing events has been reviewed by Kufel and Grzechnik.
In yeast, the core snR30 RNP comprising the RNA and the four canonical H/ACA proteins is formed in the same way as other H/ACA snoRNPs that modify rRNA. This process begins with formation of a protein complex initiated by the binding of the protein Shq1 to Cbf5 in the cytoplasm. Shq1 associates with the RNA-binding domain of Cbf5, mimicking interactions of H/ACA snoRNA, which is known to be strongly bound by Cbf5. The Shq1-Cbf5 complex is then shuttled into the nucleus where the majority of RNP maturation takes place. During this process, Nop10, Nhp2, and an assembly factor called Naf1 bind, forming a large protein complex. At the site of snR30 transcription, a pair of ATPases catalyzes the release of Cbf5 from Shq1, freeing the enzyme to tightly bind the snoRNA in its place. This interaction is mediated by Naf1 binding to the large subunit of RNA polymerase II. After binding of Cbf5 to the H/ACA guide RNA, the subsequent maturation step involves the competitive binding of Gar1 to Cbf5, resulting in the displacement of Naf1, which is recycled back to the cytoplasm for further maturation. The now fully assembled H/ACA snoRNP is shuttled from Cajal bodies to the nucleolus by Nopp140. Localization signals for transport into the nucleolus appear to be located in the H and ACA boxes, as well as the general structure of the guide RNA, at least for vertebrates. For U17, the ACA box is required for assembly in vitro and presumably in vivo, signifying that nucleolar localization is dependent upon formation of the RNP. As only mature snoRNPs are found within the nucleolus, maturation must occur between transcription in the nucleoplasm and import into the nucleolus. It is possible that some steps of snR30 maturation may occur within Cajal bodies similar to the maturation of other snoRNAs. 
Immature C/D box RNPs as well as a subgroup of H/ACA RNAs called H/ACA small Cajal body RNAs (scaRNA) can be readily detected within Cajal bodies, and some evidence suggests that H/ACA snoRNPs may in general also traverse Cajal bodies. In conclusion, current evidence suggests that the maturation of the snR30/U17 RNP follows the same steps as that of canonical H/ACA guide RNAs, as all H/ACA RNAs including snR30 assemble with the same proteins.
Typical H/ACA guide RNAs in eukaryotes share a similar secondary structure composed of two hairpins connected by a hinge region. The two hairpins are followed by two conserved sequence motifs, the H box (ANANNA) and the ACA box, respectively. While most known H/ACA RNAs contain two hairpins in eukaryotes, there are instances of H/ACA guide RNAs having one or three hairpins in selected eukaryotic organisms as well as in archaea. snR30 is an unusual H/ACA guide RNA that has two primary hairpins, the 5′ and 3′ hairpins, but also possesses a third internal hairpin as well as a 41-nt leader sequence at its 5′ end (Figure 2). Notably, the 5′ hairpin of snR30 is much longer than a standard H/ACA hairpin such that S. cerevisiae snR30 has an unusual length of 606 nt, almost triple the length of the average yeast H/ACA guide RNA (~200 nt). While not found in all H/ACA snoRNAs, the internal hairpin and the 41-nt leader are both features that are also present in some other H/ACA guide RNAs. Interestingly, snR30 lacks an unpaired internal bulge following the first stem of the 5′ hairpin, a feature known as the pseudouridylation pocket in standard H/ACA guide RNAs. In contrast, the 3′ hairpin contains a single-stranded bulge like all other H/ACA guide RNA hairpins, and the top of the bulge is located at a conserved 14–16 nucleotide distance from the base of the hairpin and the ACA box. In H/ACA snoRNAs directing pseudouridylation, this distance is important for properly positioning the guide RNA on the Cbf5-Nop10-Nhp2 binding surface, allowing binding of the target RNA to the pseudouridylation pocket and positioning of the target uridine into the active site of Cbf5. In the 5′ hairpin of snR30, the only similar bulge occurs too far away from the base of the stem and the H box for correct positioning of Cbf5. Accordingly, no pseudouridine has been suggested to be introduced by the 5′ hairpin of snR30. 
Similar to canonical H/ACA snoRNAs and based on the location of the H and ACA Boxes, snR30 is expected to bind one set of the H/ACA core proteins (Cbf5, Nop10, Gar1, and Nhp2) to each of the 5′ and 3′ hairpins, resulting in a predicted 2:1 stoichiometry between the proteins and the snR30 RNA. Hence, despite its elongated structure, the only unique aspect of snR30 compared to modification H/ACA snoRNAs is that its 5′ hairpin does not have any known RNA targets.
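As a purely illustrative aid, the H box consensus (ANANNA) and the ACA box described above can be expressed as simple sequence searches. The following Python sketch is not taken from any of the cited studies; the function names and the toy sequence are hypothetical, and a sequence match alone does not establish a functional box, which also depends on its position relative to the hairpins.

```python
import re

# H box consensus ANANNA (N = any nucleotide), as stated in the text.
# A lookahead is used so that overlapping candidate hexamers are all reported.
H_BOX = re.compile(r"(?=(A[ACGU]A[ACGU]{2}A))")


def normalize(seq: str) -> str:
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return seq.upper().replace("T", "U")


def find_h_boxes(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, hexamer) for every ANANNA match."""
    rna = normalize(seq)
    return [(m.start(), m.group(1)) for m in H_BOX.finditer(rna)]


def find_aca_boxes(seq: str) -> list[int]:
    """Return the 0-based start of every ACA trinucleotide."""
    rna = normalize(seq)
    return [i for i in range(len(rna) - 2) if rna[i:i + 3] == "ACA"]


# Hypothetical toy sequence, NOT a real snR30/U17 fragment.
toy = "GGCCAUAGCAGGCCACAUU"
print(find_h_boxes(toy))    # candidate H boxes
print(find_aca_boxes(toy))  # candidate ACA boxes
```

On a real snoRNA, such a scan would return many spurious candidates, so the hits would need to be filtered by their location in the hinge region and near the 3′ end, respectively.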
Human U17 RNA (207 nt) is shorter than yeast snR30 and comprises four hairpins (Figure 3B), forming a secondary structure consisting of a 5′-variable domain and a 3′-conserved domain. Thus, the comparison of yeast snR30 and human U17 can reveal functionally important regions of this conserved H/ACA snoRNA. During evolution, the 5′ region of snR30/U17 was compacted to a smaller size, effectively reducing transcriptional cost, which is similar to the general trend of guide RNA shortening between single-cell and complex eukaryotes. In humans, the 5′ end of U17 is composed of two stems of similar size prior to the H box. Whereas the H box of U17 is not required for in vitro RNP formation, the H box of snR30 is critical for accumulation of snR30 in vivo. U17 also contains an internal hairpin, although it is much smaller than that of snR30. Since the 5′ structure of snR30/U17 is not conserved, mutational studies investigated whether the 5′ and internal hairpins of yeast snR30 are critical for cell viability. Indeed, both the 5′ hairpin and the internal hairpin can be individually deleted without affecting cell viability, and cell viability was maintained at a reduced level when replacing both the 5′ and internal hairpins with the 5′ hairpin of another box H/ACA RNA. Therefore, the 5′ hairpin and the internal hairpin of yeast snR30 are not directly responsible for its essential role within the cell. In contrast to the 5′ domain, the 3′ hairpin of U17 is extremely similar to that of snR30 as they both possess a structure identical to a standard H/ACA guide RNA hairpin including an unpaired bulge. As further outlined below, the conserved nature of the 3′ hairpin already indicates that this region in snR30/U17 is functionally most important.
By aligning the sequences of the snR30/U17 species from yeast, Xenopus, and humans, two strongly conserved sequence motifs in the 3′ hairpin were discovered and called m1 and m2, which are critical for ribosome formation. The m1 and m2 regions are located in the non-productive, unpaired ‘pseudouridylation pocket’ on the 3′ hairpin of snR30/U17. However, rather than being located on the distal side of the pocket where modification H/ACA guide RNAs bind their targets, they are located on the basal side of the pocket. Two complementary sequences in 18S rRNA were discovered and designated as rm1 and rm2, and mutational studies confirmed Watson-Crick base pairing between the m1/rm1 and m2/rm2 sequences that is necessary for pre-rRNA processing. While there is some variation in the m1 and m2 sequences across eukaryotes, these are always matched by compensatory mutations in the rm1 and rm2 motifs in 18S rRNA (Table 1), underlining the importance of this base-pairing of snR30/U17 with 18S rRNA for ribosome biogenesis.
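The logic of compensatory mutations described above can be illustrated with a minimal Python sketch. It assumes strict Watson-Crick pairing only (G-U wobble pairs are deliberately excluded) and antiparallel strands; the sequences are hypothetical placeholders, not the actual m1/m2 or rm1/rm2 motifs, and the function names are introduced here for illustration only.

```python
# Strict Watson-Crick complements for RNA; G-U wobble is intentionally excluded.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence (5'->3') that pairs perfectly with `rna` (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_watson_crick_duplex(guide: str, target: str) -> bool:
    """True if guide and target (both 5'->3') pair antiparallel at every position."""
    guide, target = guide.upper(), target.upper()
    if len(guide) != len(target):
        return False
    return all(COMPLEMENT.get(g) == t for g, t in zip(guide, reversed(target)))


# Hypothetical motif pair, standing in for a guide motif and its rRNA partner.
guide_wt = "AUUCC"
target_wt = reverse_complement(guide_wt)

# A point mutation in the guide disrupts pairing with the wild-type target ...
guide_mut = "AUGCC"
# ... and a compensatory change in the target restores it.
target_comp = reverse_complement(guide_mut)

print(is_watson_crick_duplex(guide_wt, target_wt))
print(is_watson_crick_duplex(guide_mut, target_wt))
print(is_watson_crick_duplex(guide_mut, target_comp))
```

In this toy case, pairing is lost with the single guide mutation and regained with the matching target change, which mirrors the reasoning behind the mutational studies and the co-variation summarized in Table 1.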
Figure 2. Secondary structures of yeast snR30 and human U17 and their modus operandi of pre-rRNA binding. The H and ACA sequences that characterize each H/ACA snoRNA are boxed. The m1 and m2 motifs are labelled and the base-pairing to target pre-rRNA is shown in red. Additional regions of snR30/U17 predicted to have a function such as forming further interactions with 18S rRNA are depicted in blue (C2 and C3 in S. cerevisiae snR30, rRCSIII in human U17). HP, hairpin; IHP, internal hairpin. Minor bulges and imperfect base-pairing are not represented. snR30 was adapted from Atzorn et al., and U17 was adapted from Ruhl et al.
In addition to the critical and highly conserved m1 and m2 regions, additional elements of snR30 or U17 have been identified that also bind to 18S rRNA, although these secondary interactions are not conserved across all species. In S. cerevisiae, crosslinking, ligation, and sequencing of hybrids (CLASH) uncovered additional areas of interaction between snR30 and 18S rRNA. Notably, two of the strongest interaction sites, called C2 and C3, also contain regions of sequence complementarity between snR30 and 18S rRNA and are conserved among fungi (Figure 3). First, a region in the 5′ hairpin of snR30 is proposed to interact with helix 1 in expansion segment 6 (ES6) of 18S rRNA in close vicinity to the interaction of the m1 and m2 regions with helix 3 of ES6. However, the importance of this interaction remains to be investigated since the 5′ hairpin of snR30 is dispensable. Second, a 19-nt region within the internal hairpin of snR30 has the potential to base-pair to expansion segment 7 (ES7) in 18S rRNA, which is also conserved in Schizosaccharomyces pombe. Whereas these interaction sites are likely specific to fungi, other putative contacts between U17 and 18S rRNA have been reported in vertebrates. In humans, the U17 rRCSIII sequence in stem 2 of the 5′ domain is predicted to base-pair with 18S rRNA at positions 967–976 (Figure 2), and this sequence complementarity is conserved not only in mammals, but also in birds, fish, amphibians, and reptiles. An additional element in stem 1 of the 5′ domain, called rRCSI, may form 12 base-pairs with the 18S rRNA, but is only conserved in fish and amphibians, not in mammals. It remains unknown whether these predicted contacts between vertebrate U17 and 18S rRNA form in vivo and whether they are of functional importance for ribosome biogenesis. 
Due to the divergence of snR30/U17 sequence and structure over evolution, different interactions with 18S rRNA may have formed that may serve similar functions in stabilizing binding of this H/ACA snoRNA to the SSU processome.
Table 1. Comparison of snR30/U17 RNA across eukaryotes.