Viroids are plant-restricted parasites that represent a remarkable model system for analyzing many aspects of host-pathogen interactions at the genomic level. As the smallest known agents of infectious disease (247-401 nucleotides, nt), they have a highly structured, single-stranded, circular, naked, non-coding RNA genome. Although the list of known viroid diseases has grown and the molecular characterization of the causative agents has advanced since their discovery, their origin, their evolution, and the way they interact with the host genetic machinery to induce symptoms or escape host defenses remain unclear.
Viroids were discovered by Dr. Theodor Diener during efforts to identify the cause of potato spindle tuber disease. The disease was initially believed to be caused by a virus, but experiments designed to identify the presumed virus yielded unexpected results. Most of the infectious agent present in extracts from diseased plants did not sediment into a pellet when subjected to a centrifugal force sufficient to sediment all known viruses. Density gradient centrifugation and polyacrylamide gel electrophoresis confirmed that the infectious agent was an unconventional particle. Its infectivity was abolished by treatment with ribonuclease but was unaffected by deoxyribonuclease, phenol, chloroform, n-butanol, or ethanol. In 1967 it became evident that the agent of potato spindle tuber disease was not a virus, and the term “viroid” was proposed for this free infectious RNA [2][3][4].
Viroids are plant-restricted parasites that can cause significant losses in agricultural crops. They consist of single-stranded, circular, low-molecular-weight RNA (246-496 nt) and do not possess a protein or membrane shell. Owing to their complex secondary structure, however, they have unusual properties such as resistance to ribonuclease digestion and to denaturation. Viroids do not encode any protein. Their systemic infection relies on sequestering the nucleic acid synthesis machinery of the host cell, interacting with host factors, and replicating autonomously through rolling-circle mechanisms (rates of replication may exceed rates of degradation).
Replication involves dsRNA intermediates and proceeds in three basic steps: host DNA-dependent RNA polymerases synthesize long multimeric strands, these strands are cleaved to monomers, and the monomers are finally ligated into circular units (reference). The most intriguing open questions concerning viroid diseases are: 1) how viroids replicate autonomously in host plants while evading defense mechanisms against infection (the molecular basis of viroid host range); 2) how viroids spread systemically (which host factors participate in specific steps of the replication cycle and in trafficking); and 3) how viroids cause disease symptoms without encoding proteins (what the targets of viroid pathogenicity determinants are).
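The three replication steps named above (multimeric synthesis, cleavage to monomers, circularization) can be pictured with a toy string model. The sketch below is only an illustration under strong simplifying assumptions: the 11-nt template is a hypothetical stand-in rather than a viroid sequence, strand complementarity and the host enzymes are ignored, and the function names are invented for this example.

```python
def rolling_circle_copy(circular_template, rounds):
    """Read a circular template `rounds` times, giving one linear multimer."""
    return circular_template * rounds


def cleave_to_monomers(multimer, unit_length, offset=0):
    """Cut the multimer every `unit_length` nt, starting at `offset`.

    Only full-length units are kept; the shorter end fragments are discarded.
    """
    units = [multimer[i:i + unit_length]
             for i in range(offset, len(multimer), unit_length)]
    return [u for u in units if len(u) == unit_length]


def is_same_circle(linear_a, linear_b):
    """True if the two linear molecules close into the same circle,
    i.e. one is a rotation of the other."""
    return len(linear_a) == len(linear_b) and linear_a in (linear_b + linear_b)


# Hypothetical 11-nt stand-in for a circular genome (not a real viroid).
template = "GGAACCUUAGC"
multimer = rolling_circle_copy(template, rounds=3)
monomers = cleave_to_monomers(multimer, unit_length=len(template), offset=4)
recircularized = [m for m in monomers if is_same_circle(m, template)]
```

Whatever the position of the cleavage site, each full-length monomer is a rotation of the template, so ligating its ends restores the original circle. This is why a single, specific processing site is enough to regenerate unit-length circular progeny from a multimer.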
Classification Scheme
Comparative analyses of primary and secondary structures have allowed viroids to be classified into two families, which differ significantly in their secondary structures, replication pathways, and subcellular localization [5].
Pospiviroidae family.- This family, which includes most viroids, takes its name from potato spindle tuber viroid (PSTVd), the type species. Viroids in this family adopt a rod-like or quasi-rod-like secondary structure, with double-stranded regions separated by unpaired internal loops, in which five structural domains can be distinguished (reference): the Central Conserved Region (CCR), which contains sites conserved among species of the same genus and plays a role in viroid replication and processing; the left and right terminal domains (TL, TR), related to duplication and movement of the viroid; the variable domain (V), which differs most among viroid species of the same genus; and the pathogenicity domain (P), which contains structural elements that contribute substantially to the regulation of symptom expression [6].
Within these domains, three regions are conserved among species: 1) the CCR, formed by two opposing series of nucleotides in the upper and lower branches, flanked by inverted repeat sequences; 2) the Terminal Conserved Region (TCR), located on the upper branch of the left terminal domain; and 3) the Terminal Conserved Hairpin (TCH), which is also located in the left terminal domain. The sequence of the CCR, together with the presence or absence of the TCR and TCH (the two regions do not occur together), serves to group the viroid species of this family into five genera; the type of CCR defines the genus. Species are primarily defined on the basis of their primary structure. An arbitrary level of 90% sequence identity is accepted to separate species from strains [7].
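The 90% identity criterion can be made concrete with a small calculation. The following is a minimal sketch, assuming the two sequences have already been aligned (gaps written as `-`), that a gap opposite a base counts as a mismatch, and that the threshold is inclusive; none of these conventions is specified in the text, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base.

    Assumes pre-aligned input of equal length with '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def same_species(aligned_a, aligned_b, threshold=90.0):
    """Apply the arbitrary identity threshold (inclusive by assumption)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-nt alignments, not real viroid sequences.
reference = "ACGUACGUAC"
variant = "ACGUACGUAG"    # one difference
divergent = "ACGUACGAAG"  # two differences
```

With these toy inputs, `variant` would be treated as a strain of the reference and `divergent` as a separate species. In practice the result depends on how the alignment is built and how gaps are scored, so the threshold should be read together with the alignment method used.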
Avsunviroidae family.- The three conserved motifs mentioned above are absent from the four viroid species that belong to the second family, Avsunviroidae, which is named after its type species, avocado sunblotch viroid (ASBVd). The classification of these species is based on G+C composition, predicted secondary structure, the morphology of the hammerhead ribozyme (HHRz), and insolubility in LiCl. In these species the strands of both polarities undergo self-cleavage through hammerhead ribozymes; furthermore, two of them (ASBVd, PLMVd) adopt clearly branched secondary structures with tertiary structure elements that help to stabilize the fold.
The division of viroids into two families is also supported by differences in replication. Members of the Pospiviroidae replicate and accumulate in the nucleus and nucleolus through an asymmetric rolling-circle mechanism. The host enzymes that could be involved are nucleases, specifically members of the RNase III family; a ligase different from the only ligase characterized in plants; and DNA-dependent RNA polymerase II [8]. Members of the Avsunviroidae, in contrast, replicate and accumulate in the chloroplast through a symmetric rolling-circle mechanism that uses a nuclear-encoded polymerase (NEP)-like plastid RNA polymerase; their strands of both polarities self-cleave through hammerhead motifs, and they have been postulated to be capable of self-ligation as well [9].
The Subviral RNA Database (http://subviral.med.uottawa.ca/cgi-bin/home.cgi) provides updated information, including a compilation of forty-three complete viroid genomes for which 4700 variant sequences are reported (Pelchat, et al. 2003). The family Pospiviroidae is represented by twenty-seven species and the family Avsunviroidae by four species; several other species have not been classified.
Functional Analysis of Secondary Structure
Viroid genome sizes range from 246 to 491 nucleotides. Despite their minimal genomes and lack of coding capacity, viroids modulate their own replication and direct their intracellular, intercellular, and long-distance movement. They also activate plant defense mechanisms, which in most cases are insufficient to prevent the expression of symptoms (Daros and Flores 2004b). Many different biological functions are condensed into the short viroid sequence and, because no proteins are encoded, the viroid genome must express these functions directly.
Previous studies have shown that almost every nucleotide is functional and under natural selection. Significant advances in the understanding of viroid-host relationships will result from a comprehensive dissection of the function of viroid genome motifs, and in understanding how they interact with particular cell factors.
Pospiviroidae: Conserved domains, sequence motifs, and hairpins
As discussed above, the type species of the Pospiviroidae adopts a rod-like secondary structure. This structure became evident from the sequencing of the PSTVd intermediate strain (PSTVd-DI), from thermodynamic predictions, and from electron microscopy (Gross, et al. 1978; Henco, et al. 1979; Sanger, et al. 1976; Sogo, et al. 1973). The rod-like structure was later proposed for most Pospiviroidae species. Comparative sequence analysis revealed the presence of five structural domains (Keese and Symons 1985).
TR, TL.- These domains are interchangeable between viroids and may play a role in RNA rearrangements during viroid evolution. These domains are also involved in viroid movement in plants (Hammond 1994; Maniataki, et al. 2003).
TL.- The left terminal end of most species of the genera Pospiviroid and Apscaviroid contains an imperfect repeat; in PSTVd the repeated sequences span nt 341-358 and nt 2-21. Each repeat is an imperfect palindrome that allows the formation of a rod-like or a Y-shaped structure. Iresine viroid (IrVd) lacks this repeat, and it was proposed that this viroid could have originated from an ancestral viroid by deletion of one of the repeats in this region (Spiecker 1996). Site-directed mutagenesis combined with Agrobacterium-mediated inoculation linked this region to the first steps of transcription (Hu, et al. 1997).
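In the nucleic-acid sense, a palindrome is a sequence that reads the same as its own reverse complement, and an imperfect palindrome tolerates a few non-complementary positions. The sketch below illustrates that idea only; it assumes strict Watson-Crick pairing (G:U wobble pairs are counted as mismatches), uses hypothetical helper names, and the short test sequences are invented rather than taken from the PSTVd TL domain.

```python
# Watson-Crick complements for RNA; wobble pairs are deliberately not allowed.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def palindrome_mismatches(rna):
    """Count position pairs (i, n-1-i) that are not complementary.

    For odd lengths the central base has no partner and is ignored.
    """
    rna = rna.upper()
    n = len(rna)
    return sum(1 for i in range(n // 2)
               if COMPLEMENT[rna[i]] != rna[n - 1 - i])


def is_imperfect_palindrome(rna, max_mismatches=2):
    """True if the sequence is palindromic apart from 1..max_mismatches pairs."""
    return 0 < palindrome_mismatches(rna) <= max_mismatches


perfect = "GGAUCC"    # equals its own reverse complement
imperfect = "GGAACC"  # central pair A/A cannot base-pair
```

A perfect palindrome can fold back on itself as a fully paired hairpin stem, whereas the mismatches of an imperfect one appear as internal loops or bulges. The `max_mismatches` cutoff is an arbitrary illustrative choice.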
P (Pathogenicity).- In most pospiviroids this region has an oligopurine stretch in the upper strand and the corresponding oligopyrimidine stretch in the lower strand. Their pairing forms a structural region of low thermodynamic stability. Sequencing of PSTVd pathogenicity variants showed that only slight sequence variations in this region were associated with symptom expression, and the region was named the VMR (Virulence Modulating Region). The domain is characterized by an oligo(A5-6) sequence. Viroid pathogenicity has long been analyzed in relation to the unique, highly base-paired structure of the genomic RNA. Sequence changes within this domain have dramatic effects on PSTVd symptom expression (Schnolzer, et al. 1985); however, changes in the other four structural domains can also have significant effects on symptom development (Qi and Ding 2003b; Rodriguez and Randles 1993; Sano, et al. 1992; Sano and Ishiguro 1998; Visvader and Symons 1985). Only 40 of the 359 nucleotides in the PSTVd genome make up the pathogenicity-modulating region. The segments at nt 40-60 in the upper strand and nt 200-321 in the lower strand constitute a partially double-stranded region. Binding of the dsRNA-activated protein kinase (PKR) to the P region was reported earlier as a primary pathogenic event (Diener, et al. 1993), but subsequent work unveiled a more complex scenario controlled by other determinants in the five domains (Qi and Ding 2003; Sano, et al. 1992).
CCR.- The CCR is the most highly conserved region among viroids. It is formed by two stretches of conserved nucleotides in the upper and lower strands, flanked by an imperfect inverted repeat in the upper strand (McInnes and Symons 1991). This domain is crucial for the mechanism proposed for the replication and processing of longer-than-unit-length PSTVd (+) RNA transcripts. The structure responsible for processing contains four helices, with a cleavage/ligation site between nucleotides G95 and G96 (Baumstark and Riesner 1995; Baumstark, et al. 1997; Diener 1986; Keese and Symons 1985).
V (Variable).- This domain shows the highest sequence variability between closely related viroids. It contains a G:C box with a minimum of three G:C pairs whose function is still unknown. The boundaries of the V domain are defined by the change from low sequence homology within the domain to the higher homology of the adjacent C and TR domains.
TR.- Sequence duplications at the right terminal end of the smallest CCCVd variant give rise to longer molecules, which exist in dimeric and circular forms. A small purine/pyrimidine motif in this domain, conserved in all members of the genus Pospiviroid, has been proposed to mediate systemic transport (Gozmanova, et al. 2003; Maniataki, et al. 2003).
Optical melting curves and kinetic studies led to the conclusion that viroids denature at unphysiologically high temperatures in a highly cooperative transition: the native rod-like structure dissociates, which allows the formation of branched structures containing stable hairpins. These stable hairpins are not part of the rod-like structure.
Loop E.- A particular internal loop is located in the center of the CCR (nt 98-102 and 256-261 in PSTVd). This loop shows homology to loop E of eukaryotic 5S rRNA. Direct UV irradiation of PSTVd-infected tomato leaves followed by RNA analysis showed that loop E is also formed in vivo (Eiras, et al. 2007; Wang, et al. 2007). In the processing structure, the cleavage site lies close to a tetraloop containing the phylogenetically conserved sequence -GAAA-, which is also part of loop E (Schrader, et al. 2003).
The loop E processing domain governs viroid processing, as studied with a linear 148-nucleotide RNA that covers the core of the CCR domain and includes a 17-nucleotide duplication of the upper strand. Mutations within this motif alter host specificity or viroid pathogenicity. The motif is involved in RNA-RNA and RNA-protein interactions and is found in a wide range of RNAs in nature. The cleavage sites in the structure leave two 3’-protruding nucleotides in each strand. Class III RNases display a clear preference for substrates with a compact secondary structure. CEVd monomeric (+) transcripts that accumulate in Arabidopsis thaliana have these termini (Gas, et al. 2008). In vitro evidence for a role of this structure in viroid ligation during replication came from the incubation of PSTVd (+) RNA carrying a short repeat of the upper strand of the CCR. Because circular forms were produced, it was proposed that enzymatic cleavage and ligation depend on the conserved sequence -GAAA- (Baumstark, et al. 1997). Remarkably, the loop E motif may also be involved in host specificity, viroid pathogenicity (Qi and Ding 2003), and progeny accumulation (Zhong, et al. 2006). This loop has been recognized as part of the binding site of two proteins: transcription factor IIIA (TFIIIA), which activates 5S rRNA transcription, and ribosomal protein L5, which is associated with post-transcriptional processing of the RNA from the nucleoplasm to the nucleolus. Recent data showed that L5 and TFIIIA from Arabidopsis thaliana bind PSTVd (+) RNA with the same affinity as they bind their 5S RNA, whereas their affinity for chloroplastic viroid RNA is lower (Eiras, et al. 2011). However, loop E is not conserved in all members of the Pospiviroidae, and it remains to be determined whether alternative elements have a similar function.
Hairpin I.- In pospiviroids, hostuviroids and cocadviroids, the upper strand of the CCR can form a thermodynamically stable hairpin of nine base pairs, named hairpin I. The hairpin I loop has 14-15 nucleotides and a palindromic sequence. Results obtained in vivo with Arabidopsis thaliana lines expressing dimeric CEVd transcripts revealed that the cleavage site of the (+) strands is in the upper CCR (Daros and Flores 2004a; Gas, et al. 2007), in a position homologous to the one in PSTVd (Baumstark, et al. 1997). This conserved sequence has additionally been implicated in the import of PSTVd into the nucleus (Abraitiene, et al. 2008; Zhao, et al. 2001).
Hairpin II.- This motif is located within the V and TL domains. The helix has a length of 11-12 nucleotides and is rich in G-C pairs. The structure is absent in CCCVd (Hadidi 2003). Site-directed mutagenesis of PSTVd showed that this motif is critical for infectivity and that it acts as a functional element of (-) strand replication intermediates (Loss, et al. 1991; Qu, et al. 1993). Consistent with this critical role, it is highly conserved among the Pospiviroidae. The hairpin II motif forms through sequential folding of (-) strand intermediates and is essential for their template activity in (+) strand synthesis during replication (Candresse, et al. 2001).
Hairpin III.- This hairpin has only been found in PSTVd (Henco, et al. 1979), and a hairpin IV motif has been identified in CLVd (Owens, et al. 2003); the functional roles of both remain unknown.
Avsunviroidae: Pathogenicity determinants, Hammerhead Ribozymes, and elements of high structure
ASBVd was initially proposed to fold into an elongated conformation (Symons 1981); later reports, however, showed that the left terminal domain is bifurcated (Gast, et al. 1996). Other viroid species of this family have also been predicted to adopt stable multibranched conformations, with evidence for such conformations in vivo (Ambros, et al. 1998; Ambros, et al. 1999; Pelchat, et al. 2000; Rodio, et al. 2006; Yazarlou, et al. 2012). The genus Avsunviroid does not share remarkable sequence similarity with the genus Pelamoviroid; only the conserved HHRz sequences are shared. ASBVd is unique among viroids in having a G-C content of only ~38%, whereas the G-C content of PLMVd is above 52% (http://www.ncbi.nlm.nih.gov).
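G-C content, one of the criteria used to compare these genera, is simply the fraction of G and C residues in the sequence. The following is a minimal sketch of that calculation; the function name is illustrative, ambiguous symbols are skipped by assumption, and the short test strings are invented rather than taken from ASBVd or PLMVd.

```python
def gc_content(sequence):
    """Percent of G and C among the unambiguous bases of a sequence.

    Accepts RNA or DNA letters in either case; any symbol other than
    A, C, G, U or T (for example N or a gap) is ignored by assumption.
    """
    bases = [b for b in sequence.upper() if b in "ACGUT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Invented toy sequences, used only to exercise the function.
au_rich = "AAUUAAUUGC"   # 2 of 10 bases are G or C
balanced = "GGCCAAUU"    # 4 of 8 bases are G or C
```

Applied to complete genome sequences retrieved from a public database, the same function would give values comparable to the percentages quoted above, provided the full circular sequence is counted once.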
Pathogenicity determinants.- Reverse genetics has shown that the sequence of the tetraloop closing the stem formed between nucleotides 62 and 100 determines symptom expression in CChMVd. The two major examples are the tetraloop GAAA, known as an asymptomatic determinant, and the tetraloop UUUC, recognized as a symptomatic effector. Analysis of a population of cDNA clones showed that symptomatic strains have a 12-13 nucleotide insertion compared with asymptomatic ones. The loops that close helices I and II lie outside the HHRz core and contribute, under selection, to faster and more efficient catalytic self-cleavage.
Hammerhead Ribozymes.- The catalytic activity of these RNAs resides in the ability of strands of both polarities to fold into hammerhead structures, which facilitate cleavage of a specific phosphodiester bond through a transesterification that yields 5’-hydroxyl and 2’,3’-cyclic phosphodiester termini (Hutchins, et al. 1986; Prody, et al. 1986). HHRzs promote self-cleavage of the multimeric intermediates into unit-length strands (Flores, et al. 2001). These ribozymes consist of a central conserved core flanked by three double-stranded helices (I, II and III), usually capped by short loops (1, 2 and 3), and resemble a Y in which helices II and III are virtually colinear (Martick and Scott 2006).
In vitro and in vivo analyses confirmed the predicted physical contact between loops 1 and 2 and showed that this interaction is critical at the low Mg2+ concentrations found under physiological conditions (De la Pena, et al. 2003; Khvorova, et al. 2003). It is unknown whether the ligation step is also mediated by these motifs in vivo, although in vitro evidence showed that they have the capacity to perform this reaction (Przybilski, et al. 2005; Przybilski and Hammann 2007a; Przybilski and Hammann 2007b).
Elements of high structure.- Co-variations between two hairpin loops in the most stable predicted secondary structure of PLMVd point to a potential base-pairing interaction between these hairpins, suggesting that at least some variants adopt a kissing-loop interaction (Ambros and Flores 1998; Ambros, et al. 1998). Recent in vitro probing confirmed this interaction (Dube, et al. 2010). Approaches similar to those used for PSTVd revealed another element of tertiary structure, formed between two conserved nucleotides, in the PLMVd (+) strand. UV irradiation of PLMVd-infected leaves confirmed that this element exists in vivo (Hernandez, et al. 2006), but its functional role remains unknown. The monomeric linear (+) form of ELVd has the potential to traffic to the nucleus and subsequently to the plastids, where replication takes place, which suggests that trafficking could be regulated by an RNA motif restricted to the left terminal domain of ELVd (Pallas, et al. 2012).