RNA-binding proteins (RBPs) are a class of proteins that interact with RNA and provide an important checkpoint for fine-tuning gene expression at the RNA level, making them a key component of post-transcriptional regulation. RBPs interact with the untranslated regions of RNAs, which have cis-acting regulatory functions, forming dynamic ribonucleoprotein complexes that control the fate of RNA. They regulate the synthesis, editing, processing (including capping, splicing and polyadenylation), transport and localization, storage, translation and turnover of RNA in diverse systems including mammals, yeast and plants. Post-transcriptional regulation is increasingly recognized as a critical component in adjusting global cellular transcript levels during development and in response to environmental stresses. Despite the technical challenges that large-scale studies face in plants, several hundred RBPs have been identified and characterized over the past few years. Recent discoveries have brought to light RBPs lacking classical RNA-binding domains, which could not be revealed by in silico analysis. Uncovering these hidden RBP repertoires will advance our understanding of the RBP-RNA interaction universe and set the pace towards potential biotechnological applications of RBPs.
RNA-binding proteins (RBPs) are ubiquitous in living systems, from unicellular to multicellular organisms. Apart from their apparent role in determining the fate of RNA molecules, they play crucial roles in a myriad of cellular processes, including the regulation of gene expression. RBPs allow cells to rapidly alter their expression patterns in response to various environmental stimuli. A quick response is particularly critical in unicellular organisms, whose survival depends largely on their ability to acclimatize to environmental changes.
Several techniques have been used to identify RBPs, each with its own advantages and limitations. Most notably, homology-based in silico prediction techniques, which rely heavily on detecting structural similarities and the presence of classical RNA-binding domains (RBDs), have unearthed a few hundred RBPs. Most recently, a homology-based technique that uses machine learning, with only the protein sequence as input, has been used to uncover nucleic acid-binding proteins, including RBPs. However, these methods have inherent limitations, such as limited proteome coverage and a high rate of false positives. For instance, RBPs that are devoid of classical RBDs are omitted entirely, which restricts the proteome coverage of these in silico techniques.
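The coverage problem of domain-based prediction can be illustrated with a minimal Python sketch. It is not any published tool: the protein names and domain annotations are hypothetical, and the set of classical RBD labels is an illustrative, non-exhaustive assumption.

```python
# Illustrative, non-exhaustive set of classical RNA-binding domain (RBD) labels.
# The labels are an assumption made for this sketch.
CLASSICAL_RBDS = {"RRM", "KH", "DEAD-box", "PUF", "dsRBD", "CSD", "zf-CCCH"}


def predict_rbp_by_domain(domains):
    """Return True if at least one annotated domain is a classical RBD."""
    return any(domain in CLASSICAL_RBDS for domain in domains)


# Hypothetical proteins with made-up domain annotations.
annotations = {
    "protein_A": ["RRM", "RRM"],
    "protein_B": ["KH"],
    # A metabolic enzyme without a classical RBD; if it binds RNA in vivo,
    # a purely domain-based predictor cannot detect it.
    "protein_C": ["NAD-binding", "GAPDH C-terminal"],
    "protein_D": ["kinase"],
}

predicted = {
    name: predict_rbp_by_domain(domains) for name, domains in annotations.items()
}

# Proteins the predictor rejects; any true RBP among them is a false negative.
not_predicted = sorted(name for name, hit in predicted.items() if not hit)
print(predicted)
print(not_predicted)
```

In this toy setting, a protein such as `protein_C` is rejected regardless of its actual RNA-binding behaviour, which is the blind spot that experimental approaches are meant to address.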
Given the limitations of in silico RBP identification techniques, high-resolution proteome-wide studies have narrowed the gap and increased the number of candidate proteins implicated in RNA binding. For instance, genome-wide protoarrays and fluorescent RNA probes have previously been used to identify RBPs in yeast. In some studies, a mass spectrometry (MS)-based technique that employs aptamer-tagged RNA as bait has been used to capture RBPs. The fairly recent mRNA interactome capture (RIC) technology, which has been optimized for plants, furnishes the best example. This technology has uncovered several hundred novel RBPs lacking classical RBDs, such as proteins involved in intermediary metabolism. RIC is based on UV-crosslinking, which fixes proteins to their putative mRNA targets. Following purification by affinity capture, candidate RBPs are identified using tandem mass spectrometry. The strategy yielded well over a thousand different proteins, together with several hundred proteins that were functionally classified as RNA-binding with known RBDs or that had orthologs identified in mammals, C. elegans, or S. cerevisiae. In addition to these classical RBPs, the RIC technology unearthed over 1800 novel candidate RBPs that are devoid of typical RBDs, thereby broadening our understanding of the RBP repertoire in general.
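One way to picture the final step of such a workflow is a simple enrichment filter on the mass spectrometry output. The Python sketch below assumes that each protein has one intensity from a UV-crosslinked sample and one from a non-crosslinked control; this control design, the pseudocount, the threshold and all values are assumptions made for illustration, not parameters taken from the studies discussed here. Real analyses typically rely on replicates and statistical testing.

```python
import math


def log2_fold_change(crosslinked, control, pseudocount=1.0):
    """Log2 ratio of crosslinked over control intensity.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a protein is absent from the control.
    """
    return math.log2((crosslinked + pseudocount) / (control + pseudocount))


def call_candidates(intensities, min_log2_fc=1.0):
    """Return proteins enriched in the crosslinked sample, sorted by name."""
    return sorted(
        protein
        for protein, (crosslinked, control) in intensities.items()
        if log2_fold_change(crosslinked, control) >= min_log2_fc
    )


# Made-up intensities: (UV-crosslinked sample, non-crosslinked control).
intensities = {
    "protein_A": (800.0, 50.0),   # strongly enriched after crosslinking
    "protein_B": (120.0, 110.0),  # similar in both samples, likely background
    "protein_C": (300.0, 0.0),    # detected only after crosslinking
}

candidates = call_candidates(intensities)
print(candidates)
```

A fold-change cut-off alone is a crude criterion; it is used here only to show how crosslink-dependent enrichment separates candidate RBPs from background binders.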
Despite its celebrated success in the discovery of putative RBPs, RIC is consistently limited by contamination with ribosomal RNA and DNA. A modified protocol known as enhanced RIC (eRIC), which aims to minimize these limitations, has recently been established. This technique uses a locked nucleic acid (LNA)-modified capture probe, which reportedly offers greater specificity and a higher signal-to-noise ratio than unmodified RIC. Owing to this higher signal-to-noise ratio, eRIC can detect RNA-protein interactions that would otherwise escape detection by unmodified RIC. For instance, one study noted that in cells treated with a potent RNA demethylase inhibitor, eRIC detected m6A-responsive RBPs that evade RIC detection. However, the benefits of this technique and of other recently modified versions of RIC have yet to be realized in plant systems.
Various enzymes of intermediary metabolism were detected in mRNA interactomes. Catalogues of mRNA-bound proteins now suggest a more general functional relevance of enzymes moonlighting as RBPs, supporting the earlier RNA, enzyme and metabolite (REM) hypothesis, which proposes a link between metabolism and RNA-based regulation of gene expression. Notably, in addition to proteins controlling the fate of bound mRNA, the RNAs could in turn serve as regulators of enzymatic activity, possibly through competition or allosteric activation/repression, or by acting as scaffolds for the assembly of enzyme complexes.
A comparison of the Arabidopsis, mammalian, C. elegans and S. cerevisiae mRNA interactomes uncovered a common set of enzymes with roles in intermediary metabolism, including enzymes involved in glycolysis and the tricarboxylic acid cycle. This interspecific comparative analysis also revealed distinct differences in the RBP repertoires of each organism, suggesting that these RBPs are also tissue- and species-specific. Interestingly, some of these enzymes of intermediary metabolism were altered under drought stress conditions, indicative of a link between post-transcriptional gene regulation and stress-induced metabolic changes, either via RBPs regulating their own mRNAs or vice versa. In response to drought stress, four carbohydrate metabolism enzymes, namely glyceraldehyde 3-phosphate dehydrogenase C-2 (GAPDH), aldehyde dehydrogenase 7B4, pyruvate dehydrogenase E1 component and aconitase, which are also responsive to abscisic acid (ABA) stimulus and water deprivation, were among the enzymes detected as differentially regulated at the level of their RNA interactions. For example, at the protein level the expression of glyceraldehyde 3-phosphate dehydrogenase increases in response to cold stress, whereas in response to drought stress an increase was noted at the post-transcriptional level, which may denote a rise in the transcript level of its target RNA.
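The logic of such a cross-species comparison, a shared core plus organism-specific members, can be sketched with plain set operations in Python. The sketch assumes that proteins from each interactome have already been mapped to common ortholog group identifiers; the identifiers and the membership shown are placeholders, not data from the studies discussed here.

```python
# Placeholder interactomes: each set holds hypothetical ortholog group
# identifiers for enzymes detected as mRNA-bound in that system.
interactomes = {
    "Arabidopsis": {"enzyme_1", "enzyme_2", "enzyme_3"},
    "mammalian": {"enzyme_1", "enzyme_2", "enzyme_4"},
    "C. elegans": {"enzyme_1", "enzyme_2"},
    "S. cerevisiae": {"enzyme_1", "enzyme_2", "enzyme_5"},
}

# Enzymes detected in every interactome (the conserved core).
shared = set.intersection(*interactomes.values())

# Enzymes detected in one interactome only: subtract the union of all others.
specific = {
    species: members
    - set().union(*(m for s, m in interactomes.items() if s != species))
    for species, members in interactomes.items()
}

print(sorted(shared))
for species, members in specific.items():
    print(species, sorted(members))
```

Absence from one interactome does not prove absence of RNA binding, since detection also depends on tissue, growth conditions and experimental depth, so the organism-specific sets are best read as hypotheses.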
Increasing evidence of enzyme-RNA interactions sheds light on the dual functionality of metabolic enzymes. For example, in non-plant systems, in vivo and in vitro evidence confirms the existence of RNA-binding activity within the nicotinamide adenine dinucleotide (NAD)-binding pocket of GAPDH. As in the mammalian system, eight proteins containing GAPDH NAD-binding and GAPDH C-terminal domains were observed in plants. GAPDH has been shown to bind diverse RNA species, including AU-rich elements, tRNAs and the telomerase RNA component (TERC). Based on evidence from other systems, it is reasonable to suggest that the same principle is conserved in plants and that GAPDH potentially interacts with RNA through its NAD-binding pocket.