Advances in RNA sequencing (RNA-seq) have led to the identification of long non-coding RNAs (lncRNAs). Molecular studies have shown that lncRNAs act as important regulators of gene expression at the transcriptional and post-transcriptional levels, in both physiological and pathological conditions, yet the cellular functions of many identified lncRNAs remain unknown. Here we summarize the achievements of lncRNA studies, including the identification of lncRNA interactomes, structural studies and the creation of reporters for lncRNA activity. We also collect recent data on the involvement of lncRNAs in diseases and the clinical applications of these molecules, and discuss the major problems remaining in the lncRNA field, pointing to future challenges.
1. Overview
Non-coding RNAs (ncRNAs) were long considered unimportant additions to the transcriptome. Yet, in light of numerous studies, it has become clear that ncRNAs play important roles in development, health and disease. Long ignored, long non-coding RNAs (lncRNAs), ncRNAs of more than 200 nucleotides, have gained attention due to their involvement as drivers or suppressors of a myriad of tumours. The detailed understanding of some of their functions, structures and interactomes has been the result of interdisciplinary efforts, as in many cases new methods needed to be created or adapted to characterise these molecules. Unlike most reviews on lncRNAs, we summarize the achievements of lncRNA studies by considering the approaches used to identify lncRNA functions, interactomes and structural arrangements. We also provide recent data on the involvement of lncRNAs in diseases and present applications of these molecules, especially in medicine.
2. Background
Non-coding RNAs (ncRNAs) were once considered superfluous by-products due to their lack of direct involvement in translation. It is now clear that these molecules play important roles in fine-tuning cellular functions. ncRNAs are generally classified into two groups: those longer than 200 nucleotides, the long non-coding RNAs (lncRNAs), and those shorter, the small non-coding RNAs (sncRNAs) [1, 2].
Despite their extraordinary numbers (there are more than 50,000 annotated lncRNA genes in the human genome [3]), lncRNAs were considered transcriptional noise until rather recently. LncRNAs are expressed at very low levels and show more cell type- or tissue-specific expression patterns than mRNAs. The biogenesis of lncRNAs is similar to that of mRNAs, with transcription, splicing and polyadenylation mediated through RNA polymerase II [4]. The heterogeneity of lncRNAs is further increased by the existence of isoforms arising from post-transcriptional alternative cleavage, alternative (or absent) polyadenylation and/or alternative splicing [5, 6, 7, 8]. Based on their genomic localisation, lncRNAs can be classified as intronic (transcribed from an intron within a protein-coding gene), intergenic (lincRNA; located between two protein-coding genes) or enhancer (eRNA; transcribed from genomic regions distant from a gene's transcription start site that positively regulate the expression of the nearest genes), in addition to being sense, antisense or bidirectional with reference to neighbouring genes [5, 9].
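The positional classes described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not an annotation pipeline: the `Gene` and `Transcript` records, the single-chromosome coordinate model and the catch-all "gene-overlapping" label are hypothetical, and enhancer RNAs and bidirectional transcripts are not handled because they require regulatory annotation beyond gene coordinates.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Gene:
    """Hypothetical minimal protein-coding gene model (0-based, half-open coordinates)."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"
    introns: List[Tuple[int, int]] = field(default_factory=list)


@dataclass
class Transcript:
    """Hypothetical minimal lncRNA transcript record on the same chromosome."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def classify_by_location(lnc: Transcript, genes: List[Gene]) -> str:
    """Classify a lncRNA relative to the first protein-coding gene it overlaps."""
    for gene in genes:
        overlaps = lnc.start < gene.end and gene.start < lnc.end
        if not overlaps:
            continue
        # Same strand as the gene means sense, opposite strand means antisense.
        orientation = "sense" if lnc.strand == gene.strand else "antisense"
        # Intronic: the transcript lies entirely inside one intron of the gene.
        in_intron = any(s <= lnc.start and lnc.end <= e for s, e in gene.introns)
        if in_intron:
            return f"intronic ({orientation} to {gene.name})"
        # Label introduced for this sketch only; the text does not name this case.
        return f"gene-overlapping ({orientation} to {gene.name})"
    # No overlap with any protein-coding gene.
    return "intergenic (lincRNA)"


genes = [Gene("GENE_A", 1000, 5000, "+", introns=[(2000, 3000)])]
print(classify_by_location(Transcript("lnc1", 2100, 2600, "-"), genes))
# intronic (antisense to GENE_A)
print(classify_by_location(Transcript("lnc2", 6000, 7000, "+"), genes))
# intergenic (lincRNA)
```

In practice such a classification would be run against a full gene annotation and would need to consider chromosome identity and multiple overlapping genes, which this sketch deliberately leaves out.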
In contrast to small non-coding RNAs (sncRNAs), lncRNAs are poorly conserved evolutionarily at the sequence level, and their cellular functions are highly heterogeneous [9]. Some lncRNAs affect chromatin structure by interacting with both DNA and chromatin-modifying proteins, creating scaffolds for DNA-protein complexes [9]; other lncRNAs can bind genomic loci neighbouring their site of transcription to initiate genomic imprinting. Large lincRNAs can also control gene expression by recruiting enzymes that participate in histone modification [10]. Additionally, lncRNAs can regulate translation, splicing and RNA stability through interaction with mRNAs [9]. Some lncRNAs seem to work as sponges that inhibit the activity of sncRNAs, namely microRNAs (miRNAs) [11, 12], and some bind mRNAs or proteins [13, 14, 15], acting as stabilisers/degrons, translocators or modulators of their activity [16, 17, 18]. The interaction of lncRNAs with other macromolecules is achieved through structural recognition and/or base-pairing [5], making lncRNAs decoys, signals or guides [9]. Of note, some lncRNAs have also been found to encode peptides within small ORFs (smORFs; containing fewer than 100 codons) [19, 20].
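To make the smORF definition concrete, the following Python sketch scans a transcript for candidate open reading frames with fewer than 100 codons. It is illustrative only and rests on assumptions not stated in the text: only the three forward frames are scanned, ORFs start at a canonical ATG (AUG), the standard stop codons are used, and the codon count includes the start codon but excludes the stop codon. The function name `find_smorfs` is hypothetical, and a sequence hit says nothing about whether a peptide is actually produced.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (DNA alphabet)
MAX_SMORF_CODONS = 100               # "fewer than 100 codons", per the text


def find_smorfs(seq: str, max_codons: int = MAX_SMORF_CODONS) -> List[Tuple[int, int, str]]:
    """Return (start, end, orf) for forward-strand ORFs shorter than max_codons.

    Coordinates are 0-based and half-open; the returned ORF includes the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon to the first in-frame stop.
            j = i + 3
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop downstream: no complete ORF left in this frame
            n_codons = (j - i) // 3  # start codon included, stop codon excluded (assumption)
            if n_codons < max_codons:
                hits.append((i, j + 3, seq[i:j + 3]))
            i = j + 3  # resume scanning after the stop codon
    return hits


print(find_smorfs("CCATGAAATTTTAGGG"))
# [(2, 14, 'ATGAAATTTTAG')]
```

Experimental evidence such as ribosome profiling or peptide detection would still be needed to decide whether a candidate smORF is translated.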
Only around 1% of known human lncRNAs have been characterised to date. Progress in this field is hampered by their limited expression in the cell, the low level of lncRNA sequence conservation and the large variety of their mechanisms of action [21].
Furthermore, processes such as regular co-transcriptional splicing [22] or post-transcriptional back-splicing [23, 24] can produce another class of lncRNAs, the circular RNAs (circRNAs). A circRNA produced by back-splicing can be formed from within an intron (ciRNA, circular intronic RNA), from one or more exons, or from exon fragments together with intron sequence (elciRNA) [25]. Differences in circRNA biogenesis might be important for their localization and thus their functions: for example, ciRNAs and elciRNAs mainly accumulate in the nucleus and are thought to regulate transcription [26, 27], while exonic circRNAs are mostly present in the cytoplasm, where they seem to act in post-transcriptional gene regulation, e.g., as miRNA sponges [25, 28].
LncRNAs can also be classified by the function they perform: imprinted lncRNAs, disease-associated lncRNAs, pathogen-induced lncRNAs, miRNA sponges and bifunctional RNAs [10]. Imprinted lncRNAs have an important role in reinforcing local chromatin organisation, resulting in one of the autosomal alleles of a gene being epigenetically silenced [29]. Disease-associated lncRNAs are those whose expression is postnatally silenced in most tissues but re-activated during regeneration or pathophysiological conditions such as tumorigenesis [30]. The expression of pathogen-induced lncRNAs is modulated in response to invading microorganisms, such as Helicobacter pylori and Salmonella enterica [31, 32]. Bifunctional lncRNAs can have more than one role in gene expression and, in some cases, contain smORFs [33].
This review outlines the methods and tools currently used to study lncRNA biology in terms of structure, interactome, activity and function. We have also summarized the mounting data on the potential applications of lncRNAs, especially in medicine.
3. Conclusions
LncRNAs represent a new class of RNAs that fine-tune complex physiological processes and the onset of diseases. A detailed understanding of their functions, structures and interactomes is challenging, as the conventional methods used to study mRNA functions are inefficient for lncRNAs. In this review we summarized the interdisciplinary efforts undertaken to characterise lncRNAs. In many cases, a combination of several techniques was successfully applied; for example, the structure of these molecules has been predicted by bioinformatic analysis followed by enzymatic or chemical probing of lncRNAs and HTS. Moreover, the ability of lncRNAs to interact with DNA, RNA and proteins to exert their functions raises their complexity to a higher level. To explore the lncRNA interactome, techniques based on crosslinking RNAs with their partners, employing known RNA-binding proteins (RBPs), labeling RNA–protein complexes or adapting chromatin immunoprecipitation (IP) with oligonucleotide probing of RNA are coupled with sequencing or mass spectrometry (MS), depending on the target molecules. Yet the nature and dynamics of such interactions remain to be elucidated. Functional studies on lncRNAs have required new tools dedicated to their versatile activities in the cell. To determine lncRNA function and map functional elements, current methodologies target specific regions or whole lncRNA genes through RNA interference (RNAi), antisense oligonucleotides (ASOs), peptide nucleic acids (PNAs) or CRISPR/Cas9. The improvement of these methods is of great value, as lncRNAs are connected with a wide spectrum of diseases. As their biology is explored, disease-related lncRNAs will gain greater relevance as potential biomarkers in cancers and for personalized medicine, especially gene therapy.
This entry is adapted from the peer-reviewed paper 10.3390/cancers13112643