Long noncoding RNAs (lncRNAs) play crucial roles in various life processes, including circadian rhythms. Although next-generation sequencing technologies have accelerated the profiling of lncRNAs, the resulting datasets require sophisticated computational analyses. In particular, the regulatory roles of lncRNAs in circadian clocks are far from completely understood.
1. Introduction
The circadian clock, an endogenous time-keeping mechanism, regulates unique 24-h rhythms of metabolism, physiology and behavior
[1]. The suprachiasmatic nucleus (SCN) of the hypothalamus hosts the master clock that drives the circadian rhythms in various tissues and organs
[2]. The malfunction of circadian clocks has been closely linked to health problems, such as sleep disorders, mental diseases, and cancers
[3]. A variety of model organisms including the fruit fly (
Drosophila melanogaster) and the zebrafish (
Danio rerio)
[4] have been used to study the operating mechanisms of circadian clocks. The fruit fly is an ideal organism to investigate circadian clocks in insects
[5] because of its easy genetic manipulation, breeding in a controlled environment, and monitoring of locomotor activities
[6]. The zebrafish is also an attractive organism to study the circadian clock in vertebrates
[7,8,9]. The ease of obtaining large numbers of early zebrafish embryos enables investigation of the onset of circadian rhythmicity
[4]. Recently, the genetic dissection of the zebrafish circadian clock has demonstrated that zebrafish share conserved transcription/translation negative feedback loops with fruit flies, mice and humans
[10,11,12]. In particular, zebrafish embryos are transparent and do not require feeding for several days post-fertilization
[13,14]. Hence, the confounding effects of feeding can be avoided
[15]. As such, zebrafish embryos/larvae allow for studying circadian rhythms independent of feeding.
lncRNAs represent a diverse set of noncoding RNAs that contain more than 200 nucleotides
[16]. Interestingly, several lncRNAs have been implicated in regulating circadian rhythms, including circadian rhythms in cancer cells
[17,18]. The expression patterns of nearly 100 lncRNAs were shown to be closely linked to the synthesis of the hormone melatonin in the rat pineal gland
[19]. Melatonin is an integral component of the circadian clock system
[7]. The testis is responsible for several key biological functions, such as producing the germ cells and circulating testosterone, the most active androgen
[20]. However, the debate over the existence of circadian rhythms in the testis is still unresolved. For example, some studies suggest a lack of circadian clocks in the testis
[21], whereas other studies strongly support the presence of circadian activity in the testis
[22]. These findings inspired us to investigate the lncRNA-mediated circadian activities in both the pineal gland and testis
[8], thereby uncovering 586 and 165 rhythmically expressed lncRNAs in zebrafish pineal gland and testis, respectively. In particular, 26 rhythmically expressed lncRNAs were shown to be coexpressed in both organs
[8]. We hypothesize that some lncRNAs are also rhythmically expressed in zebrafish larvae.
Although lncRNAs do not encode canonical proteins, recent studies suggest that they are involved in numerous fundamental biological processes, including determination of cell fate
[23], gene regulation
[24], transcription, and various diseases
[24]. In fact, thousands of lncRNAs have been identified
[25] from a diverse set of organisms, including humans
[26,27]. For example, GENCODE v7 contains 14,880 human lncRNA transcripts
[28], while the ZFLNC lncRNA database
[29] catalogues over 21,000 zebrafish lncRNAs. Interestingly, numerous lncRNAs
[30] have been shown to encode micropeptides, consisting of approximately 100 amino acids
[31]. The micropeptides differ from the functional proteins that often contain more than 400 amino acids
[32]. The lncRNA-encoded micropeptides have been demonstrated to regulate various biological processes and activities, such as muscle function, transcription, and mRNA stability
[33]. Toddler, a lncRNA-encoded micropeptide, regulates Apelin receptors to control cell movement in zebrafish
[34]. A skeletal muscle-specific lncRNA-encoded micropeptide, myoregulin (MLN), was found to regulate muscle performance
[35]. A recent study
[36] discovered a conserved 79-amino-acid microprotein, FORCP, encoded by lncRNA LINC00675. A 60-amino-acid micropeptide, ASRPS, is encoded by a small open reading frame in lncRNA LINC00908
[37,38]. A micropeptide, miPEP155, encoded by lncRNA MIR155HG was shown to suppress autoimmune inflammation
[30]. In particular, our computational analysis recently revealed hundreds of coding lncRNAs in zebrafish
[39].
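The idea of a small open reading frame inside a lncRNA can be made concrete with a minimal sketch. It is not the coding-potential pipeline used in the cited work. The function names, the forward-strand-only scan, the `min_codons` filter, and the 100-amino-acid cutoff are all illustrative assumptions, the cutoff being based on the approximate micropeptide length mentioned above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=10):
    """Return sorted (start, end, n_amino_acids) tuples for ATG..stop ORFs.

    Only the three forward-strand reading frames are scanned (assumption).
    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # stop codon is not translated
                if n_aa >= min_codons:
                    orfs.append((start, i + 3, n_aa))
                start = None
    return sorted(orfs)

def is_micropeptide(n_aa, max_aa=100):
    """Hypothetical length rule: peptides of at most `max_aa` residues."""
    return n_aa <= max_aa

# Toy transcript: one ORF of 12 codons (ATG + 11 x GCT) followed by a stop.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
for start, end, n_aa in find_orfs(toy):
    print(start, end, n_aa, is_micropeptide(n_aa))
```

A real analysis would also consider the reverse strand, alternative start codons, and conservation or ribosome-profiling evidence before calling a peptide.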
2. LncRNAs Display Circadian Rhythmicity in Zebrafish Larvae
lncRNAs have been implicated in numerous biological processes
[24,40]. Although previous studies have uncovered coding potentials
[39], expression profiles
[26,27,41], and numerous rhythmically expressed lncRNAs from different zebrafish organs
[8], our understanding of the involvement of lncRNAs in circadian regulation remains far from complete. Despite a few studies
[17,42] investigating the circadian regulation of lncRNAs, the effect of light on rhythmically expressed zebrafish larval lncRNAs has not been studied. Although our previous study
[8] identified 26 rhythmically expressed lncRNAs coexpressed in zebrafish pineal gland and testis, further research was needed to investigate how many of these 26 lncRNAs are rhythmically/circadianly expressed in zebrafish larvae.
In this study, we generated time-course transcriptome profiles of zebrafish larvae and used state-of-the-art bioinformatic tools to investigate circadianly expressed lncRNAs under both constant darkness (DD) and constant light (LL) conditions, uncovering circadian dynamics that regulate the expression profiles of the zebrafish larval lncRNAs. In comparison to a recent study
[19] that investigated circadian regulation of over one hundred lncRNAs in the rat pineal gland, including elucidation of the operating mechanism of circadian clocks of eight lncRNAs in the suprachiasmatic nucleus (SCN), our study identified thousands of zebrafish larval transcripts under both DD and LL conditions. Specifically, we investigated the expression profiles of 3220 lncRNAs under DD and LL conditions, identified 578 circadianly expressed lncRNAs, and annotated them with GO, COG, and KEGG pathway enrichment analyses. The computational findings suggest that most of these circadianly expressed larval lncRNAs potentially contribute to crucial biological functions.
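The text does not name the algorithm used to call a lncRNA circadianly expressed, so the following is only a minimal sketch of one common approach, a single-component cosinor fit with a 24-h period. It assumes evenly spaced time points that cover whole periods (such as CT0, CT4, ..., CT20), which makes the cosine and sine regressors orthogonal and the fit a pair of simple sums. The function name and the example values are hypothetical.

```python
import math

def cosinor_fit(times_h, values, period_h=24.0):
    """Fit values ~ mesor + amplitude * cos(w * (t - peak_h)).

    Assumes evenly spaced samples spanning an integer number of periods,
    so the least-squares solution reduces to the sums below.
    Returns (mesor, amplitude, peak_h).
    """
    n = len(values)
    w = 2 * math.pi / period_h
    mesor = sum(values) / n  # rhythm-adjusted mean
    a = 2 / n * sum(v * math.cos(w * t) for t, v in zip(times_h, values))
    b = 2 / n * sum(v * math.sin(w * t) for t, v in zip(times_h, values))
    amplitude = math.hypot(a, b)
    peak_h = (math.atan2(b, a) / w) % period_h  # time of peak expression
    return mesor, amplitude, peak_h

# Hypothetical expression values sampled every 4 h over one day.
times = [0, 4, 8, 12, 16, 20]
expr = [10 + 3 * math.cos(2 * math.pi * (t - 8) / 24) for t in times]
mesor, amplitude, peak_h = cosinor_fit(times, expr)
print(round(mesor, 3), round(amplitude, 3), round(peak_h, 3))
```

In practice, a significance test and multiple-testing correction would be needed before labelling a transcript rhythmic, and the peak time is meaningless when the fitted amplitude is near zero.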
We compared the circadianly expressed larval lncRNAs with lncRNAs from the pineal gland and testis
[8] and found that nine circadianly expressed larval lncRNAs under the DD condition, and 12 under the LL condition, are also coexpressed in both the pineal gland and testis. All of these belong to the 26 lncRNAs coexpressed in zebrafish pineal gland and testis that we previously reported
[8] (Figure 1A,B). We investigated peptides encoded by these coexpressing lncRNAs to predict their 3D models and functions. In addition, we performed a conservation analysis of the larval lncRNAs against humans, mice, and fruit flies. We found that zebrafish larvae share as many as 35 and 42 lncRNAs with humans under DD and LL conditions, respectively, and as many as one and four lncRNAs with mice under DD and LL conditions, respectively. Hence, we selected the five circadianly expressed lncRNAs shared by these three species, investigated the corresponding lncRNA-encoded peptides, and revealed hundreds of peptides encoded by these five lncRNAs. We selected these conserved peptides, investigated their 3D models and corresponding known domains from the Protein Data Bank, and uncovered several peptides that closely resemble one another in terms of α-helix, β-strand, and random coil content.
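The coexpression comparison described above amounts to intersecting lists of lncRNA identifiers from the three sample types. A minimal sketch follows, assuming each rhythmic set is available as a collection of identifier strings. The function name and the identifiers are placeholders, not data from the study.

```python
def overlap_summary(larva, pineal, testis):
    """Partition larval lncRNA IDs by where else they are rhythmic."""
    larva, pineal, testis = set(larva), set(pineal), set(testis)
    return {
        "all_three": larva & pineal & testis,
        "larva_pineal_only": (larva & pineal) - testis,
        "larva_testis_only": (larva & testis) - pineal,
        "larva_only": larva - pineal - testis,
    }

# Placeholder identifiers for illustration only.
larva_ids = {"lnc_A", "lnc_B", "lnc_C", "lnc_D"}
pineal_ids = {"lnc_A", "lnc_B", "lnc_X"}
testis_ids = {"lnc_A", "lnc_C", "lnc_Y"}

summary = overlap_summary(larva_ids, pineal_ids, testis_ids)
for region, ids in summary.items():
    print(region, sorted(ids))
```

Running the same function once per lighting condition would give the counts needed for one Venn diagram per condition.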
Figure 1. Circadianly expressed zebrafish larval lncRNAs are coexpressed in the zebrafish pineal gland and testis. (A) Circadianly expressed lncRNAs coexpressed in larvae, pineal gland and testis under DD condition. (B) Circadianly expressed lncRNAs coexpressed in larvae, pineal gland and testis under LL condition. (C–E) Expression profiles of representative lncRNAs under DD condition: three zebrafish larval morning (CT0 and CT4) lncRNAs (C), two zebrafish larval evening (CT8 and CT12) lncRNAs (D), and four zebrafish larval night (CT16 and CT20) lncRNAs (E). (F–H) Expression profiles of representative lncRNAs under LL condition: five zebrafish larval morning lncRNAs (F), four zebrafish larval evening lncRNAs (G), and three zebrafish larval night lncRNAs (H).
Although our framework, which combines novel experimental data with computational analysis, brings unprecedented insights into the circadianly expressed lncRNAs in zebrafish larvae, the study is constrained by a few limitations inherent in the bioinformatic analysis. First, some of the peptides predicted in this study are more than 100 amino acids long. Hence, additional studies are required to investigate lncRNA-encoded micropeptides, which usually contain fewer than 100 amino acids
[31]. However, our approach, which combines biological data and computational techniques, can be applied to investigate both micropeptides and canonical peptides. Second, this study employs RNA-seq technology to investigate the lncRNAs. However, the RNA-seq technology has its own set of shortcomings
[43] and often fails to identify certain lncRNAs due to the constraints imposed by poly(A) tails
[44]. Third, although our comparative and conservation analysis reveals numerous interesting coexpressing/conserved lncRNAs, the set of such lncRNAs is far from complete. It is likely that more larval lncRNAs are coexpressed in different organs/tissues of zebrafish or conserved with other species. However, due to the lack of experimental data and sequencing information for other tissues, finding a larger number of coexpressing/conserved lncRNAs remains an open research direction. Fourth, the FIMO tool only allows for a few specific p-values to detect the E-Box, D-Box and RORE elements, which may cause multiple false positives. As such, the regulation of circadianly expressed lncRNAs by the E-Box, D-Box and RORE requires confirmation by wet-lab experiments. In fact, all the computational predictions require additional experimental validation. Fifth, although a zebrafish larva embodies a whole zebrafish, it might not be developed sufficiently to provide the best possible lncRNA expression profiles. A larva needs to undergo a long developmental process before developing an organ such as a testis, and the larval pineal gland and the adult pineal gland may use different sets of lncRNAs. It is possible that some of the lncRNAs expressed in a zebrafish larva are not expressed in either the adult pineal gland or the adult testis. Hence, comparative analysis of zebrafish larval lncRNAs with those in the pineal gland and testis requires additional experimental validation. Sixth, for several larval lncRNAs identified by similarity with ZFLNC lncRNAs, additional research is required to map them to the correct known identifiers in GenBank or Ensembl, as the ZFLNC database lacks identifiers for thousands of lncRNAs. Finally, the effect of light on lncRNAs also requires further investigation. Despite these limitations, our study uncovers interesting patterns derived from real experimental data. In particular, we predicted 3D models and functions of the conserved peptides encoded by the coexpressing/conserved lncRNAs. To the best of our knowledge, this is the first time that hundreds of circadianly expressed lncRNAs have been revealed in zebrafish larvae. Our integrative framework, which combines data and bioinformatics analysis, can be expanded to investigate the circadian regulation of a diverse set of noncoding RNAs, and should help circadian biologists select lncRNAs of interest prior to conducting time-consuming wet-lab experiments.
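The false-positive concern raised for motif scanning in the fourth limitation can be illustrated with a rough sketch. This is not FIMO, which scores position weight matrices and reports p-values. The code below only finds exact matches to the canonical E-box hexamer CACGTG and estimates how many matches would be expected by chance, under the simplifying assumption of uniform, independent base composition. The function names and the toy sequence are hypothetical.

```python
import re

EBOX = "CACGTG"  # canonical E-box; palindromic, so one strand suffices

def find_ebox_sites(seq):
    """Return 0-based start positions of exact E-box matches."""
    return [m.start() for m in re.finditer(EBOX, seq.upper())]

def expected_chance_hits(seq_len, motif_len=6):
    """Expected exact matches in random sequence (uniform bases assumed)."""
    positions = max(seq_len - motif_len + 1, 0)
    return positions * 0.25 ** motif_len

toy_promoter = "ttCACGTGaaacacgtg"
print(find_ebox_sites(toy_promoter))
print(expected_chance_hits(4101))
```

Even an exact hexamer is expected about once per 4 kb of random sequence, so scanning many long upstream regions with a permissive threshold can yield hits that are not functional, which is why experimental confirmation is needed.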