G4 Structures in Plants: History
Please note this is an old version of this entry, which may differ significantly from the current revision.

G-quadruplex (G4) oligonucleotides are higher-order DNA and RNA secondary structures of enormous relevance due to their implication in several biological processes and pathological states in different organisms.

  • G-quadruplex
  • G4
  • tetrads
  • DNA
  • RNA
  • plant

1. G4 Structures in Plants

Due to the fundamental role of plants in the development of traditional and modern medicine [1][2], and the importance of putative G-quadruplex (G4)-forming regions as a widely conserved set of nucleic acid sequences that could modulate gene regulation, research on plant G4s is highly relevant to medicinal plant improvement strategies. Although the physiological implications of G4 RNA and DNA in plant species remain largely unexplored, their study is of crucial importance for the development of improved pharmaceutical crop varieties for the sustainable extraction of therapeutic substances [3].

2. G4 DNA in Plants

Since G4 DNA formation is believed to act as a molecular switch for gene expression, genome-wide analyses of G4s have been reported for many species [4]. Thousands of potential G4-forming sequences have been identified in guanine-rich (G-rich) telomeric and non-telomeric regions of eukaryotic genomes. However, at the beginning of the last decade, only a few putative G4-forming regions had been identified in plants. This led Takahashi et al. [5] to develop a two-step strategy to identify G4 motif regions (G4MRs) in plant genomes and classify them on the basis of their positions relative to the transcription start and termination sites (TSS and TTS, respectively) of plant genes. Using computational prediction methods, they exhaustively searched for G4 motifs in the whole genomes of four plant species: Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, and Vitis vinifera. The results unveiled new rules for G4 motif regions in plant genomes and revealed consistent G4MR enrichment in the template strands at TSSs. Overall, their study made a valuable contribution to elucidating the functional roles of G4s in plant DNA [5]. The actual or theoretical functions of G4s in plants were recently reviewed by Griffin and Bass [6]. Thousands of non-telomeric G4-forming regions have been discovered across several plant species, especially near gene promoters, suggesting that plant G4s may act as a ubiquitous family of cis-regulatory elements, i.e., non-coding DNA regions that regulate the transcription of neighboring genes. Comparative analyses found that monocotyledons may exhibit up to ten times higher densities of G4-forming regions than eudicotyledons [6]. Moreover, several studies indicated that the highly diverse plant G4 structures are involved in the regulation of genes implicated in a number of pathophysiological conditions, such as DNA damage and the response to biotic and abiotic stresses.
In other words, more and more studies highlight the emerging functional role of plant G4 motifs in the development of improved crop varieties for sustainable agriculture [3].
Garg et al. identified different types of G4-forming DNA motifs in fifteen sequenced plant genomes and analyzed their distribution across genomic features such as coding, promoter, and intergenic regions and gene bodies [7]. G4s with G2-repeats were abundantly detected in all the plant species under investigation. G4s with G3-repeats were associated with intronic, intergenic, and promoter regions, while G2-type G4s were enriched in exonic coding sequences and untranslated regions. Moreover, the study revealed specific sequences present in genes conserved among monocotyledons and dicotyledons [7]. The same authors found that genes implicated in development, cell size and growth, transmembrane transport, and the regulation of gene expression were significantly enriched. Furthermore, they revealed a strong co-occurrence of specific genic motifs with the G4 sequences in promoter regions and experimentally validated the formation of G-quadruplexes by G-rich sequences found in several plants. The interaction of G4-forming DNA sequences with plant nuclear proteins was also detected in their study, which overall provided novel insights into the prevalence of G4-forming sequences in plants and supported, with functional evidence, their association with different genomic features [7].
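The motif searches described in this section can be illustrated with a regular-expression scan for four G-runs separated by short loops. The following is only a minimal sketch: it assumes a widely used consensus (runs of at least three guanines for G3-type motifs or at least two for G2-type motifs, with loops of 1 to 7 nucleotides), which is not necessarily the rule set or parameter choice used in the studies cited here. The function names and toy sequences are hypothetical.

```python
import re

# Assumption: consensus G{n,} N{1,7} G{n,} N{1,7} G{n,} N{1,7} G{n,};
# the cited studies may have used other run lengths or loop limits.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def g4_pattern(min_run=3, max_loop=7):
    """Compile a regex for four G-runs separated by three loops."""
    run = "G{%d,}" % min_run
    loop = "[ACGT]{1,%d}" % max_loop
    return re.compile("%s(?:%s%s){3}" % (run, loop, run))


def find_g4_motifs(seq, min_run=3, max_loop=7):
    """Return sorted (start, end, strand) tuples for putative G4 motifs.

    Coordinates are 0-based, end-exclusive, and always given on the
    forward strand. Matches are non-overlapping within each strand.
    """
    seq = seq.upper()
    pattern = g4_pattern(min_run, max_loop)
    hits = [(m.start(), m.end(), "+") for m in pattern.finditer(seq)]
    n = len(seq)
    # A C-rich forward strand implies a G-rich motif on the reverse strand.
    for m in pattern.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


if __name__ == "__main__":
    toy = "TTGGGAGGGTGGGCGGGTT"  # made-up sequence, not from any genome
    print("G3-type:", find_g4_motifs(toy, min_run=3))
    print("G2-type:", find_g4_motifs("AAGGTGGAGGCGGAA", min_run=2))
```

Reporting the strand of each hit is what allows predicted motifs to be classified afterwards by template or coding strand and by position relative to annotated features such as the TSS.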
Retrotransposons with long terminal repeats (LTRs) are DNA elements that make up a significant proportion of eukaryotic genomes. Especially abundant in plants, they carry genes and specific regulatory regions necessary for reverse transcription, and were found to contain several G4-forming sequences in a study of more than 18,000 full-length LTR retrotransposons collected from 21 plant species [8]. Specifically, the G4 motifs were often located in the LTRs, both upstream and downstream of the promoters that drive transcription of the whole retrotransposon. Upstream-located G4s were predominant on the negative-sense DNA strand, while downstream-located G4s prevailed on the positive-sense strand, pointing to a role at the level of transcription as well as at post-transcriptional stages [8]. Circular dichroism (CD) spectroscopy demonstrated that these G-rich sequences were able to adopt quadruplex structures with different topologies, some folding as parallel and others as antiparallel G4 structures [8]. Moreover, potential triplex-forming sequences were found mainly in the 3′-untranslated region (3′UTR) and, to a lesser extent, in the 5′UTR. Overall, the study revealed a potential role of G4 and triplex DNA as regulatory elements in several processes of the life cycle of LTR retrotransposons and as potential recombination sites during retrotransposon-based genome rearrangements [8].
Bioinformatic studies have also explored the biological pathways involving G4 structures in plants [9], which had previously been studied more in the context of human diseases than in plants. In particular, the bioinformatic investigation of Volná et al. was accompanied by a detailed CD spectroscopic study aimed at identifying stable G-quadruplexes in the gene RPB1, which is conserved across different plant and mammalian species and codes for the large subunit of RNA polymerase II. The RPB1 G4-forming locus was shown to be highly evolutionarily conserved among plants belonging to the Archaeplastida kingdom, whose common ancestor lived more than one billion years ago. The described plant G4s were also hypothesized to interact with UV light, potentially adding a further layer to the regulatory network [9].
Telomeres are structures consisting of repeats of a short G-rich sequence, such as TTAGGG, found in eukaryotes and studied especially in mammals [10]. Located at the ends of linear chromosomes, they play fundamental roles in genomic stability. While it is well known that mammalian telomeric G-rich repeats can form G4 structures that modulate telomere functions [10], telomeric DNA in plants is less studied. Plant telomeres show TTTAGGG (in place of TTAGGG) repeats, which play an essential role in plant growth and development, as well as in environmental adaptation [11]. Plant telomeric G4 structures were identified through bulk and single-molecule assays, including single-molecule FRET approaches and CD spectroscopy, which led to a comprehensive characterization of the dynamics and structure of the plant telomeric DNA sequence GGG(TTTAGGG)3. In the presence of potassium cations, this typical telomeric sequence folded into mixed G4 structures, including parallel and antiparallel topologies. Intermediate dynamic transitions, including G-based hairpin, parallel triplex, and antiparallel triplex structures, were also detected. Interestingly, the model telomeric G4 structure was unfolded by the AtRecQ2 helicase but left untouched by AtRecQ3 [11].
Another study, aiming to highlight the functional relevance of G4s in evolutionarily distinct plant species, analyzed the genome of garden pea (Pisum sativum), a unique member of the Fabaceae family, and showed that it contains several putative G4-forming DNA sequences [12]. Intriguingly, these G4 motifs were located nonrandomly in the nuclear genome of garden pea. Remarkably, other putative G4-forming sequences were found in chloroplast and mitochondrial DNA, and G4 structure formation was experimentally confirmed for sequences found in both organelles. The frequency of putative G4-forming sequences in nuclear DNA was in the same range as in chloroplast DNA (ca. 0.5/kbp) but significantly lower than in mitochondrial DNA (1.6/kbp) [12]. While putative G4-forming sequences in the nuclear genome were associated mainly with regulatory regions, including 5′UTRs, and with the region upstream of the ribosomal RNA genes, those in mitochondrial and chloroplast DNA were located around RNA genes. The non-random localization of putative G4-forming sequences points to their functional and evolutionary significance in the garden pea genome [12].
Using bioinformatic techniques, Wang et al. explored putative G4 motifs in several genomic regions of two Oryza sativa subspecies (indica and japonica) and in the whole genomes of eight other plant species [13]. Across all ten plant genomes, they found that G4 motif density was higher in monocotyledons than in dicotyledons. Putative G4-forming DNA sequences were widely distributed in the O. sativa genome. The G4 motifs were relatively highly enriched in 5′UTRs and near transcription start sites [13], leading to the hypothesis that G4s in the plant species investigated are involved in gene transcription and subsequent translation. Moreover, analyzing the distribution of loop lengths in G4s, the same authors estimated that, in the introns of the indica subspecies, the density of putative G4-forming sequences with long loops was lower than that of sequences with short loops, while it did not differ significantly from that found in japonica. In addition, through gene ontology analysis of the loci with putative G4-forming sequences, Wang et al. identified several gene ontology terms that were highly correlated with loci containing at least one G4 motif. The gene ontology analysis in the two subspecies of Oryza sativa provided a useful example for elucidating the functional roles of G4s in plants [13].
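As a small worked example of the quantities mentioned above, the sketch below expands the shorthand GGG(TTTAGGG)3 into the full plant telomeric sequence, checks it against a commonly used G4 consensus pattern, and computes a motif frequency per kbp. The consensus pattern (four runs of at least three guanines, loops of 1 to 7 nucleotides) is an assumption and not necessarily the definition used in the cited studies. The counts passed to the density function are toy numbers chosen for illustration, not data from any genome.

```python
import re

# Assumption: a commonly used G4 consensus, not the cited authors' exact rule.
G3_MOTIF = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def expand_repeat(prefix, repeat, copies):
    """Expand shorthand such as GGG(TTTAGGG)3 into a full sequence."""
    return prefix + repeat * copies


def motifs_per_kbp(n_motifs, length_bp):
    """Frequency of motifs per 1000 base pairs of sequence."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return n_motifs / (length_bp / 1000.0)


# Plant-type telomeric model sequence from the text.
plant_telomeric = expand_repeat("GGG", "TTTAGGG", 3)
# Mammalian-type repeat, expanded the same way for comparison.
mammalian_telomeric = expand_repeat("GGG", "TTAGGG", 3)

if __name__ == "__main__":
    for name, seq in [("plant", plant_telomeric),
                      ("mammalian", mammalian_telomeric)]:
        fits = G3_MOTIF.fullmatch(seq) is not None
        print(name, seq, len(seq), "nt, fits consensus:", fits)
    # Toy numbers only: 5 predicted motifs in a 10,000 bp sequence.
    print("toy density:", motifs_per_kbp(5, 10_000), "per kbp")
```

Both telomeric repeats provide four GGG runs, so they satisfy the sequence consensus. Which topology actually folds (parallel, antiparallel, or mixed) cannot be read from the pattern and has to be determined experimentally, as in the CD and FRET studies cited above.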

3. G4 RNA in Plants

Plants contain several non-coding RNA G4 structures endowed with regulatory functions that are also common to viruses, prokaryotes, protozoa, and humans [14]. Yang et al. identified several guanine-rich RNA sequences in the plant transcriptome, profiled their folding potential in vitro, and showed that they are potentially able to form G4 structures [15]. In more detail, using both high-throughput sequencing and cell imaging methods, the same authors detected RNA G4s at the genome-wide scale as well as in living cells [16]. A global abundance of RNA G4 motifs with two G-quartets was observed, and the global RNA folding potential was highly influenced by these four-stranded secondary structures. Remarkably, both in vitro and in vivo RNA chemical structure profiling revealed hundreds of strongly folded RNA G4 structures in both mouse-ear cress (Arabidopsis thaliana) and rice (Oryza sativa) and provided, for the first time, direct evidence of the formation of RNA G4 structures in living eukaryotic cells [15]. Furthermore, biochemical and genetic analyses indicated that RNA G4 folding regulates translation, ultimately modulating plant growth. Overall, Yang et al. not only demonstrated for the first time the existence of RNA G4s in vivo but also indicated that RNA G4 structures play diverse and often insufficiently explored roles in the regulation of plant growth and development [15]. Equally important, recent investigations have emphasized the central role that RNA structures play in plant adaptation. More specifically, among the many complex RNA structures, G4s are widespread across the transcriptomes of a number of plant species, as shown by computational predictions and experimental evidence [16], and act as regulatory motifs important for plant adaptation to diverse environmental factors, also from an evolutionary perspective [16].
To investigate the role of nucleotide composition in gene function and in the ecological adaptation of plant species to distinct environmental conditions, Yang et al. systematically studied the nucleotide compositions of transcriptomes across 1000 plants and their corresponding habitats [17]. Interestingly, plants growing in cold climates were found to have G-enriched transcriptomes, which can readily fold into RNA G4 structures. By immunofluorescence detection and in vivo structure profiling, they found that RNA G4 structure formation in plants was significantly enhanced in response to cold [17]. Cold-responsive RNA G4 structures were found to strongly enhance mRNA stability rather than affecting translation. Conversely, disrupting the RNA G4 structure led to mRNA decay in the cold and impaired the plant's response to cold. The results of their study therefore suggested that, during evolution, plants adopted the RNA G4 structure as a molecular marker to improve their adaptation to the cold [17]. The folding of transfer RNA (tRNA) fragments into G4 structures and the implications of G4s in translational inhibition have been studied in plants and compared with mammalian systems. In particular, the influence of human and plant tRNA fragments and model G4 structures on translation in wheat germ extract and rabbit reticulocyte lysate was demonstrated by Jackowiak et al. [18], who were able to associate the efficiency of translational inhibition in the mammalian system with the type of G4 topology. However, the same authors observed that in plants the ability of a small RNA to adopt a G4 structure was not sufficient to block translation, suggesting that other structural determinants are involved [18].
In the context of G4 structure formation and its biological role in plants, an experimental study based on CD titration, UV melting, in-line probing, and reporter gene assays led to the discovery of a plant RNA G4 structure that was able to inhibit translation in Arabidopsis thaliana [19]. This G4 motif was located within the 5′UTR of the mRNA, and the G4 structure was found to be highly stable and thermodynamically favored over a competing hairpin structure in the 5′UTR at physiological potassium and magnesium concentrations. Transient reporter gene assays conducted in living plants showed that the G4 structure inhibited translation but not transcription, indicating that it acts as a translational repressor in vivo. Moreover, in-line probing elucidated the secondary structure of the RNA, supporting the in vitro formation of the G4 structure in the context of the complete 5′UTR [19].

4. Plant G4-Binding Proteins

Given the importance of G4-binding proteins [20][21] as valuable targets for strategies aimed at modulating G4-related processes in different organisms, Volná et al. also investigated them bioinformatically in plants. They screened the available plant protein sequences to identify the best candidates with the potential to bind G4 structures [22]. The authors started from the observation that two similar arginine- and glycine-rich G4-binding motifs had previously been reported in humans: the so-called “RGG motif” (with the amino acid sequence RRGDGRRRGGGGRGQGGRGRGGGFKG) and the more recently described “NIQI motif” (whose sequence is RGRGRGRGGGSGGSGGRGRG). With this information in mind, they searched for plant proteins containing these motifs in their amino acid sequences using two bioinformatic approaches (BLASTp and FIMO scanning) [22]. They found numerous proteins containing the G4-binding motifs shared with humans and were able to describe the core proteins involved in G4 folding and resolving in algae and green plants, including Arabidopsis thaliana, the model plant organism of their study. The resulting G4-binding protein candidates were sorted by their physiological and molecular functions and were hypothesized to play significant roles in the regulation of gene expression in plants [22].
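The idea of screening protein sequences for the two motifs can be illustrated with a much simpler stand-in for the tools named above. The sketch below is not BLASTp or FIMO: it assumes a plain ungapped sliding-window comparison that reports the best fraction of identical residues between each motif and a protein, with no substitution matrix, position weight matrix, or significance statistics. The function names, the 0.8 threshold, and the toy protein sequences are hypothetical.

```python
# Motif sequences as quoted in the text.
RGG_MOTIF = "RRGDGRRRGGGGRGQGGRGRGGGFKG"
NIQI_MOTIF = "RGRGRGRGGGSGGSGGRGRG"


def best_window_identity(protein, motif):
    """Slide the motif along the protein without gaps.

    Returns (best_fraction_identical, start_index); start_index is -1
    when the protein is shorter than the motif or nothing matches.
    """
    protein = protein.upper()
    k = len(motif)
    best = (0.0, -1)
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        identical = sum(a == b for a, b in zip(window, motif))
        fraction = identical / k
        if fraction > best[0]:
            best = (fraction, start)
    return best


def screen_proteins(proteins, motifs, threshold=0.8):
    """Return (protein_id, motif_name, start, identity) for hits.

    The threshold is an arbitrary illustrative cut-off (assumption).
    """
    hits = []
    for protein_id, sequence in proteins.items():
        for motif_name, motif in motifs.items():
            fraction, start = best_window_identity(sequence, motif)
            if fraction >= threshold:
                hits.append((protein_id, motif_name, start,
                             round(fraction, 2)))
    return hits


if __name__ == "__main__":
    # Made-up sequences for illustration; not real plant proteins.
    toy_proteins = {
        "toy_hit": "MSK" + NIQI_MOTIF + "AAA",
        "toy_miss": "MSKAAALLLVVVDDDEEEKKKLLLAAAVVVDDD",
    }
    motifs = {"RGG": RGG_MOTIF, "NIQI": NIQI_MOTIF}
    for hit in screen_proteins(toy_proteins, motifs):
        print(hit)
```

Because arginine- and glycine-rich stretches are low-complexity sequences, a simple identity score like this one can flag many unrelated RG-rich regions. Dedicated tools with proper scoring and statistics remain the appropriate choice for a real screen.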

This entry is adapted from the peer-reviewed paper 10.3390/pharmaceutics14112377

References

  1. Ahvazi, M.; Khalighi-Sigaroodi, F.; Charkhchiyan, M.M.; Mojab, F.; Mozaffarian, V.-A.; Zakeri, H. Introduction of medicinal plants species with the most traditional usage in Alamut region. Iran. J. Pharm. Res. IJPR 2012, 11, 185.
  2. Ramawat, K.; Sonie, K.; Sharma, M. Therapeutic potential of medicinal plants: An introduction. Biotechnol. Med. Plants Vitalizer Ther. 2004, 2004, 1–18.
  3. Yadav, V.; Hemansi; Kim, N.; Tuteja, N.; Yadav, P. G Quadruplex in Plants: A Ubiquitous Regulatory Element and Its Biological Relevance. Front. Plant Sci. 2017, 8, 1163.
  4. Lyu, J.; Shao, R.; Kwong Yung, P.Y.; Elsässer, S.J. Genome-wide mapping of G-quadruplex structures with CUT&Tag. Nucleic Acids Res. 2022, 50, e13.
  5. Takahashi, H.; Nakagawa, A.; Kojima, S.; Takahashi, A.; Cha, B.-Y.; Woo, J.-T.; Nagai, K.; Machida, Y.; Machida, C. Discovery of novel rules for G-quadruplex-forming sequences in plants by using bioinformatics methods. J. Biosci. Bioeng. 2012, 114, 570–575.
  6. Griffin, B.D.; Bass, H.W. Review: Plant G-quadruplex (G4) motifs in DNA and RNA; abundant, intriguing sequences of unknown function. Plant Sci. 2018, 269, 143–147.
  7. Garg, R.; Aggarwal, J.; Thakkar, B. Genome-wide discovery of G-quadruplex forming sequences and their functional relevance in plants. Sci. Rep. 2016, 6, 28211.
  8. Lexa, M.; Kejnovsky, E.; Steflova, P.; Konvalinova, H.; Vorlickova, M.; Vyskot, B. Quadruplex-forming sequences occupy discrete regions inside plant LTR retrotransposons. Nucleic Acids Res. 2013, 42, 968–978.
  9. Volná, A.; Bartas, M.; Karlický, V.; Nezval, J.; Kundrátová, K.; Pečinka, P.; Špunda, V.; Červeň, J. G-Quadruplex in Gene Encoding Large Subunit of Plant RNA Polymerase II: A Billion-Year-Old Story. Int. J. Mol. Sci. 2021, 22, 7381.
  10. Neidle, S.; Parkinson, G.N. The structure of telomeric DNA. Curr. Opin. Struct. Biol. 2003, 13, 275–283.
  11. Wu, W.-Q.; Zhang, M.-L.; Song, C.-P. A comprehensive evaluation of a typical plant telomeric G-quadruplex (G4) DNA reveals the dynamics of G4 formation, rearrangement, and unfolding. J. Biol. Chem. 2020, 295, 5461–5469.
  12. Dobrovolná, M.; Bohálová, N.; Peška, V.; Wang, J.; Luo, Y.; Bartas, M.; Volná, A.; Mergny, J.-L.; Brázda, V. The Newly Sequenced Genome of Pisum sativum Is Replete with Potential G-Quadruplex-Forming Sequences—Implications for Evolution and Biological Regulation. Int. J. Mol. Sci. 2022, 23, 8482.
  13. Wang, Y.; Zhao, M.; Zhang, Q.; Zhu, G.-F.; Li, F.-F.; Du, L.-F. Genomic distribution and possible functional roles of putative G-quadruplex motifs in two subspecies of Oryza sativa. Comput. Biol. Chem. 2015, 56, 122–130.
  14. Tassinari, M.; Richter, S.N.; Gandellini, P. Biological relevance and therapeutic potential of G-quadruplex structures in the human non-coding transcriptome. Nucleic Acids Res. 2021, 49, 3617–3633.
  15. Yang, X.; Cheema, J.; Zhang, Y.; Deng, H.; Duncan, S.; Umar, M.I.; Zhao, J.; Liu, Q.; Cao, X.; Kwok, C.K.; et al. RNA G-quadruplex structures exist and function in vivo in plants. Genome Biol. 2020, 21, 226.
  16. Liu, H.; Chu, Z.; Yang, X. A Key Molecular Regulator, RNA G-Quadruplex and Its Function in Plants. Front. Plant Sci. 2022, 13, 926953.
  17. Yang, X.; Yu, H.; Duncan, S.; Zhang, Y.; Cheema, J.; Miller, J.B.; Zhang, J.; Kwok, C.K.; Zhang, H.; Ding, Y. RNA G-quadruplex structure contributes to cold adaptation in plants. Nat. Commun. 2022, 13, 6224.
  18. Jackowiak, P.; Hojka-Osinska, A.; Gasiorek, K.; Stelmaszczuk, M.; Gudanis, D.; Gdaniec, Z.; Figlerowicz, M. Effects of G-quadruplex topology on translational inhibition by tRNA fragments in mammalian and plant systems in vitro. Int. J. Biochem. Cell Biol. 2017, 92, 148–154.
  19. Kwok, C.K.; Ding, Y.; Shahid, S.; Assmann, S.M.; Bevilacqua, P.C. A stable RNA G-quadruplex within the 5′-UTR of Arabidopsis thaliana ATR mRNA inhibits translation. Biochem. J. 2015, 467, 91–102.
  20. Pipier, A.; Devaux, A.; Lavergne, T.; Adrait, A.; Couté, Y.; Britton, S.; Calsou, P.; Riou, J.; Defrancq, E.; Gomez, D. Constrained G4 structures unveil topology specificity of known and new G4 binding proteins. Sci. Rep. 2021, 11, 13469.
  21. Meier-Stephenson, V. G4-quadruplex-binding proteins: Review and insights into selectivity. Biophys. Rev. 2022, 14, 635–654.
  22. Volná, A.; Bartas, M.; Nezval, J.; Špunda, V.; Pečinka, P.; Červeň, J. Searching for G-Quadruplex-Binding Proteins in Plants: New Insight into Possible G-Quadruplex Regulation. BioTech 2021, 10, 20.