FibroDB is an easy-to-use web application that allows end users to explore expression changes of protein-coding and lncRNA genes using the three most commonly used normalized expression values: counts per million (CPM), reads per kilobase of transcript per million mapped reads (RPKM), and transcripts per million (TPM).
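For readers unfamiliar with the three normalizations, the sketch below shows how each is conventionally computed from raw read counts. It is a minimal illustration of the standard definitions, not the FibroDB pipeline itself; the function names, the example counts, and the gene lengths are all hypothetical.

```python
def cpm(counts):
    """Counts per million: scale each count by library size."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: correct for gene length, then library size."""
    total = sum(counts)
    return [c / (length / 1e3) / (total / 1e6)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to sum to 1e6."""
    rates = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]


# Hypothetical toy library: three genes with made-up counts and lengths (bp).
example_counts = [100, 300, 600]
example_lengths = [1000, 2000, 3000]

example_cpm = cpm(example_counts)
example_rpkm = rpkm(example_counts, example_lengths)
example_tpm = tpm(example_counts, example_lengths)
```

CPM corrects only for sequencing depth, whereas RPKM and TPM also correct for gene length. TPM values sum to one million within each sample, which makes proportions easier to compare across samples than RPKM values.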
1. Introduction
Fibroblasts are the most common cell type in the connective tissue and are found throughout the mammalian body
[1][2]. They synthesize and secrete extracellular matrix (ECM) proteins and collagens to maintain the tissue structure. They can be easily isolated from each tissue by simply explanting a piece of tissue in a cell-culture dish as fibroblasts outgrow from such tissue and adhere to the plastic cell-culture dish
[3]. In vitro, the morphology of fibroblasts is distinct and described as large, flat, and spindle-shaped cells
[4][5]. Upon injury (e.g., an open wound in the skin or a myocardial infarction), fibroblasts are activated to become myofibroblasts, which proliferate and deposit excessive ECM, leading to progressive tissue scarring, fibrosis, and organ dysfunction
[6][7][8]. The characterization of fibroblasts has been a focus of intensive research for many years
[9][10][11][12]. However, the main outstanding issue in the field is that no single gene or protein marker can define fibroblasts, since "fibroblast" is a collective term for heterogeneous populations of cells
[13][14][15].
RNA sequencing (RNA-seq), including single-cell RNA-seq (scRNA-seq), has identified a large number of non-protein-coding RNA (ncRNA) genes. An ncRNA transcript longer than 200 nucleotides (nt) is categorized as a long non-coding RNA (lncRNA)
[16][17][18][19]. LncRNAs bind macromolecules (DNA, RNA, and proteins) to regulate various cellular processes, including epigenetics, transcription, post-transcriptional modifications, and translation
[20]. Dysregulation of lncRNA expression and function is associated with many diseases
[21]. To accommodate the growing interest in studying lncRNAs, a number of lncRNA databases have been introduced
[22], which allow users to screen for tissue-specific lncRNAs, for example. Yet no lncRNA database focused on fibroblasts is currently available.
2. Discussion
It was found that (i) a large number of lncRNA genes are expressed in fibroblasts and during fibrosis; (ii) compared to protein-coding genes, the overall expression levels of lncRNA genes are much lower in fibroblasts and TGF-β-stimulated fibroblasts; (iii) although TGF-β stimulation is a common mechanism in fibrosis, very few protein-coding and lncRNA genes share similar profiles between cardiac and pulmonary fibroblasts; and (iv) knockdown of the lncRNAs LINC00622 and LINC01711 resulted in gene expression changes associated with cellular and inflammatory responses, respectively. However, further functional and mechanistic studies are required to understand the importance of these lncRNAs in fibrosis.
As with any study, ours has limitations. First, all the RNA-seq data analyzed come from poly(A)-enriched libraries, although all are strand-specific. Because such libraries capture transcripts without poly(A) tails poorly, it is possible that lncRNAs without poly(A) tails have higher expression than those with poly(A) tails. Second, we focused only on known lncRNA genes based on the latest annotation provided by the Ensembl database. Thus, it is possible that novel lncRNA genes have higher expression than protein-coding genes. Third, only one time point after TGF-β stimulation was investigated. More mechanistic studies with longer stimulation are needed to understand the impact of silencing the candidate lncRNAs identified here.
Although it is now common to employ scRNA-seq to assess the heterogeneity of cells
[11][23][24][25], such an approach is not suitable for studying the functions of lncRNAs, because it will be difficult to identify minor populations of cells with low expression of lncRNA genes, as shown in this study. Thus, scRNA-seq data were intentionally excluded from FibroDB. Since interest in studying fibrosis has increased in recent years, the commands and Snakemake
[26] pipelines are available via the GitHub repository to allow further analysis of similar RNA-seq data.
There are several databases currently available that include expression profiles of lncRNAs. Most of these databases include the expression profiles derived from RNA-seq data of whole tissues (normal and/or tumors as in the case of C-It-Loci
[27], LncBook
[28], lncRNAtor
[29], LncExpDB
[30], RefLnc
[31]) and cell lines (LncExpDB
[30], lncRNAtor
[29], RefLnc
[31]), which is not ideal for understanding the expression profiles of a specific cell type, especially under normal physiological conditions. To address this problem, two databases that each focus on a specific cell type are available: ANGIOGENES for endothelial cells
[32] and RenalDB for cells in kidneys
[33]. FibroDB is the first lncRNA database focused specifically on fibroblasts and fibrosis. The FibroDB web application allows users to quickly search for lncRNAs differentially expressed under several experimental conditions. Furthermore, different experimental settings can be compared to narrow down the list of lncRNAs differentially expressed during fibrosis in different tissues.
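The idea of comparing experimental settings to narrow down candidates can be sketched as a set intersection over per-condition differential expression results. This is a hypothetical illustration only: the gene identifiers, fold changes, adjusted p-values, and cutoffs below are invented placeholders, not FibroDB data or its actual interface.

```python
def differentially_expressed(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return the genes passing both cutoffs.

    `results` maps a gene ID to (log2 fold change, adjusted p-value).
    The cutoff values are assumptions chosen for illustration.
    """
    return {
        gene
        for gene, (lfc, padj) in results.items()
        if abs(lfc) >= lfc_cutoff and padj < padj_cutoff
    }


# Made-up results for two hypothetical experimental settings.
cardiac = {
    "lncRNA_A": (2.1, 0.001),
    "lncRNA_B": (-1.5, 0.01),
    "lncRNA_C": (0.3, 0.6),
}
pulmonary = {
    "lncRNA_A": (1.8, 0.004),
    "lncRNA_B": (0.2, 0.7),
    "lncRNA_D": (1.2, 0.03),
}

# Genes that pass the cutoffs in both settings.
shared = differentially_expressed(cardiac) & differentially_expressed(pulmonary)
```

This sketch ignores the direction of change; a stricter comparison would also require the fold change to have the same sign in both settings.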
This entry is adapted from the peer-reviewed paper 10.3390/ncrna8010013