Pol II Pausing during Daily Gene Transcription


Please note this is a comparison between Version 1 by Xiaodong Li and Version 2 by Rita Xu.

Clock proteins and their collaborating transcription factors often act at distal enhancers to regulate rhythmic transcription from gene promoters. To control transcription, those transcription factors need to interact with the mediator complex and general transcription factors near the transcription start site. Pol II pausing, which is determined by Pol II recruitment, pause release, and premature transcription termination near the transcription start site, plays a critical role in shaping the final transcription output.

circadian rhythms
transcription
pausing

1. Introduction

Molecular clockworks consisting of feedback loops of core clock genes drive cell-autonomous circadian oscillation in various species [1]. In mammals, the transcription factors (TFs) CLOCK and BMAL1 dimerize to activate the transcription of Per1/2 and Cry1/2, whose protein products are repressors that inhibit CLOCK/BMAL1 action through negative feedback [2]. While the post-translational regulation of clock proteins plays critical roles in setting the clock pace [3,4], the prime mover of circadian oscillation is thought to be transcription [5]. High-throughput technologies such as microarray [6], RNA-seq [7,8], and ChIP-seq [9] enable the detailed characterization of gene rhythms and the genomic binding of clock proteins, allowing for in-depth analyses of circadian rhythm generation at the level of transcription.

The binding sites of clock proteins are located within open chromatin regions established by tissue-specific pioneer TFs (tsTFs), and thus are typically tissue-specific [10]. Chromatin is known to be a barrier to transcription, and DNA sequences are often not accessible to many TFs, with the exception of tsTFs that are sufficient to trigger enhancer competency within chromatin. Furthermore, tsTFs allow subsequent binding by other TFs, including clock proteins. Some tsTFs (e.g., HNF4a [11,12]) and ubiquitous TFs (u-TFs, e.g., RELA/p65 [13,14]) interact with and recruit clock proteins to their cis elements. CLOCK/BMAL1 can also facilitate the binding of some tsTFs, leading to the suggestion that CLOCK/BMAL1 acts as a pioneer-like TF [10,15]. Like many TFs [16], clock proteins recruit cofactors to modify histones and remodel nucleosomes to regulate transcription. Clock proteins and their cofactors form a complex with a molecular weight of over 1 MDa, and deficiencies in some cofactors alter clock dynamics [17]. For example, clock proteins in both Drosophila and mammals recruit the TIP60 complex to regulate clock oscillation [18,19,20]. To control Pol II transcription at the transcription start site (TSS), TFs require the mediator complex [21] to interact with general transcription factors (GTFs) that are present at gene promoters [22]. The mediator subunits interacting with the clock protein complex remain to be determined.

Traditional studies addressed how TFs and cofactors direct the mediator complex to assemble the pre-initiation complex (PIC) for transcription initiation and reinitiation at the TSS [23]. Distal enhancers bound by TFs and promoters are thought to be brought into proximity via chromatin looping, which is a process assisted by proteins such as cohesin and CTCF [24]. The traditional view of transcription, however, has difficulty explaining new findings such as transcription bursting, which represents Pol II initiation and multiple rounds of reinitiation [25]. Imaging studies in single cells revealed that the transcription of many genes, including clock genes [26], is stochastic and of low frequency. Transcription often toggles between active and inactive states within a cell, and the active state is characterized by transcription bursting followed by a prolonged dormancy of the inactive state [27]. Bursting could be readily explained by the formation of a transcription hub, i.e., a cluster and/or molecular condensate of TFs, cofactors, mediators, and Pol II molecules that permits multiple rounds of Pol II initiation [28,29]. Recent studies revealed that TFs often contain intrinsically disordered low-complexity domains, whose interactions induce the formation of transcription clusters and even molecular condensates, the latter via “liquid-liquid phase separation” [28,29]. Transcription bursting also requires pause release [30], which refers to the process of P-TEFb-licensed Pol II elongation to overcome the +1 nucleosome barrier and transcribe into the gene body [31]. Initiated Pol II travels only a short distance; it then enters the state of pausing, wherein Pol II stays paused downstream of the TSS via the actions of pausing factors (DSIF and NELF) and the +1 nucleosome [31,32].
P-TEFb is a component of the super elongation complex (SEC) [33,34] that releases paused Pol II for elongation, permitting Pol II reinitiation to achieve transcription bursting. Initiated Pol II is also subject to premature transcription termination at the 5′ end of genes, which can decrease Pol II pausing [35,36].
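The toggling between active and inactive states described above is often idealized as a two-state ("telegraph") promoter model. The following is a minimal sketch of that idealization, not a model taken from the cited studies: the function name and the rate parameters k_on, k_off, and k_init are assumptions chosen for illustration, and pausing and pause release are deliberately not modeled.

```python
import random

def simulate_bursts(k_on, k_off, k_init, n_bursts, seed=0):
    """Simulate a two-state promoter (assumed rates, arbitrary time units).

    k_on   : rate of switching from the inactive to the active state
    k_off  : rate of switching from the active to the inactive state
    k_init : Pol II initiation rate while the promoter is active
    Returns a list of (burst_start_time, active_duration, n_initiations).
    """
    rng = random.Random(seed)
    t = 0.0
    bursts = []
    for _ in range(n_bursts):
        # Dormant (inactive) interval before the next burst
        t += rng.expovariate(k_on)
        active_duration = rng.expovariate(k_off)
        # Count initiation events that fall inside the active window
        n_initiations = 0
        elapsed = rng.expovariate(k_init)
        while elapsed < active_duration:
            n_initiations += 1
            elapsed += rng.expovariate(k_init)
        bursts.append((t, active_duration, n_initiations))
        t += active_duration
    return bursts

# Hypothetical parameters: rare activation, short active windows
example = simulate_bursts(k_on=0.1, k_off=1.0, k_init=5.0, n_bursts=5, seed=1)
for start, duration, count in example:
    print(f"burst at t={start:.1f}, active for {duration:.2f}, {count} initiations")
```

In this idealization, the mean number of initiations per burst is k_init / k_off and the mean dormancy between bursts is 1 / k_on, so a small k_on with a large k_init produces infrequent but intense bursts.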

2. Transcription Regulation Is the Main Driving Force for Gene Expression Rhythms

First demonstrated for Per in flies [37], core clock genes exhibit robust daily changes in their mRNA expression. Owing to rapid co-transcriptional splicing, the pre-mRNA level can be used as a surrogate for transcription activity. RNase protection assays against Per pre-mRNA and mRNA showed that the Per mRNA rhythm in Drosophila is mainly driven at the level of transcription [38]. The post-transcriptional regulation of mRNA stability also contributes to the Per mRNA rhythm and is sufficient to confer rhythmic mRNA expression to other genes [39,40]. In mammals, core clock genes such as Per1/2 also exhibit robust daily changes in mRNA levels [41,42]. Pre-mRNA measurements indicated that rhythmic transcription is the driving force for the mRNA rhythms of core clock genes and many other genes [43,44]. Deep sequencing studies evaluated the contribution of transcription regulation to mRNA rhythm generation in a genome-wide manner. One study estimated that 22% of mRNA rhythms are driven by rhythmic transcription [9]. Later studies with a higher sequencing depth and kinetic modeling increased the estimate to about 70–80%, whereas rhythmic degradation contributes to the mRNA rhythms of 30–35% of genes [45,46]. Nuclear export, which is another post-transcriptional regulation step, contributes to rhythm generation for 10% of the rhythmic transcriptome [47]. Overall, rhythmic transcription is deemed the main driving force for gene rhythms [5].
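The way rhythmic transcription and mRNA degradation jointly shape an mRNA rhythm can be illustrated with a generic first-order model, dm/dt = s(t) − k·m, where s(t) is a 24 h cosine-modulated synthesis rate and k is a constant degradation rate. This is a minimal sketch under those stated assumptions; it is not the kinetic model used in the cited studies, and the function and parameter names are hypothetical.

```python
import math

PERIOD_H = 24.0
OMEGA = 2 * math.pi / PERIOD_H  # angular frequency of a 24 h rhythm

def mrna_rhythm(rel_amp_transcription, half_life_h):
    """Steady-state mRNA rhythm for dm/dt = s(t) - k*m.

    Assumes s(t) = s0 * (1 + A*cos(OMEGA*t)) with relative amplitude
    A = rel_amp_transcription and a constant degradation rate
    k = ln(2) / half_life_h. Returns (relative mRNA amplitude,
    delay in hours of the mRNA peak after the transcription peak).
    """
    k = math.log(2) / half_life_h
    rel_amp_mrna = rel_amp_transcription * k / math.sqrt(k**2 + OMEGA**2)
    delay_h = math.atan2(OMEGA, k) / OMEGA
    return rel_amp_mrna, delay_h

# Hypothetical half-lives: the same transcription rhythm, different stability
for half_life in (0.5, 2.0, 8.0):
    amp, delay = mrna_rhythm(rel_amp_transcription=0.8, half_life_h=half_life)
    print(f"half-life {half_life} h: relative amplitude {amp:.2f}, delay {delay:.1f} h")
```

Under this model, a long mRNA half-life damps the rhythm and delays its peak by up to 6 h, which is one reason why both transcription and degradation need to be considered when attributing an mRNA rhythm to its source.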

3. Both the Intrinsic Tissue Clock and Extrinsic Cues Can Regulate Gene Expression Rhythms

Clock genes typically harbor multiple cis elements for clock proteins, which also have numerous other binding sites across the genome. Clock proteins thus also regulate many other genes. Clock genes and other genes are also influenced by extrinsic cues, which are often rhythmic in wildtype animals. Such cues include body temperature (T_b) [48], feeding [49], and communication signals from other tissues (including the autonomic nervous system) [50]. The extrinsic cues can engage TFs as well as post-transcriptional mechanisms to regulate gene rhythms, including those of clock genes. For example, daily changes in T_b drive rhythmic HSF1 expression to regulate gene transcription [51]. The T_b rhythm also drives Cirbp expression to post-transcriptionally regulate clock dynamics [52]. Besides the T_b rhythm, blood-borne cues also regulate clock dynamics; serum and plasma can activate multiple signaling pathways to impact clock genes [53,54,55]. For example, rhythmic cues in plasma activate SRF, which regulates the transcription of the clock gene mPer2 [55]. Certain blood-borne cues impacting clock dynamics are heat labile, implying that they are proteins [56]. Lipids can also serve as inter-tissue communicating cues. For example, phosphatidylcholine is synthesized by the liver and released into plasma to activate PPARα in muscles [57]. Overall, clock proteins and many other TFs exhibit daily changes in their actions. Like clock proteins, other TFs also have thousands of genomic binding sites in various tissues. Therefore, they can potentially regulate numerous genes besides clock genes. Gene expression rhythms are thus driven by both clock proteins and other TFs. How clock proteins and other TFs work together to control gene rhythms has been the focus of recent studies in various peripheral mouse tissues.

4. Clock Proteins Typically Collaborate with Other TFs to Regulate Transcription Rhythms

Clock proteins and other TFs often collaborate to regulate target genes [10]. The independent contribution of the clock to gene rhythms is rather limited [58,59]. In studies that reconstitute clock oscillation (RE) in specific tissues of Bmal1-deficient mice [58,59], it was shown that only 10% of the rhythmic transcriptome can be restored in the livers of liver-RE mice [58]. However, that is not to say that the liver clock regulates only 10% of the rhythmic transcriptome in wildtype mice. In fact, the disruption of the liver clock disturbs about 90% of the gene rhythms in mouse liver [60,61]. Overall, those results indicate that the majority of gene rhythms are regulated in a combinatorial manner by both the intrinsic clock and the TFs engaged by extrinsic cues. By comparing the liver gene rhythms in Bmal1 KO, liver-RE, and wildtype mice under ad libitum feeding versus nighttime restricted feeding, it was shown that the mRNA rhythms in the livers of the wildtype mice can be partitioned into four parts based on their modes of regulation [62]. Some rhythms can be driven by the intrinsic liver clock alone (13.7%); some can be driven by rhythmic feeding cues alone (17.5%); some require not only the intrinsic clock, but also rhythmic feeding cues (34.5%); while the rest (34.4%) require both the intrinsic clock and rhythmic cues from other tissues (and their clocks). Those results indicate that for the regulation of a majority of gene rhythms, there is a mandatory requirement for clock proteins to collaborate with other TFs. For example, feeding engages the TF CEBPB to coregulate BMAL1 target genes, and CEBPB deficiency disrupts the rhythms of some BMAL1 target genes that are also regulated by feeding [62].

5. The Need to Study Pol II Pausing Regulation near the TSS

Clock proteins and other TFs occupying distinct enhancers of the same gene can collaborate through chromatin looping to regulate transcription. Techniques such as Hi-C and ChIA-PET revealed daily changes in the long-range interactions between distinct enhancers bound by clock proteins and other TFs, respectively, and between those enhancers and gene promoters [10,63]. The collaboration between clock proteins and other TFs can also occur at the same enhancers. Indeed, TFs often exhibit cooperative binding at the same enhancers to increase the affinities of the two factors to their respective motifs. However, cooperative binding does not necessarily lead to coactivation. For example, HNF4a and RELA/p65 can recruit CLOCK/BMAL1 for genomic binding [11,13], but can transrepress CLOCK/BMAL1-mediated transcription activation [12,14]. Such interactions between TFs at the same and/or distinct enhancers pose serious challenges in elucidating how clock proteins contribute to the final transcription output at the TSS. Indeed, the genomic binding of CLOCK/BMAL1 at enhancers is often not sufficient to specify the rhythm phase and amplitude, and cannot confer rhythmicity to some target genes [64]. Another aspect of the complexity of transcription regulation is the lack of consensus on how TFs and their cofactors at distal enhancers regulate transcription near the TSS [65]. The textbook model of chromatin looping posits that a stable contact is formed between distal enhancers and promoters. A variation of this classical model is the “kiss-and-run” model of transient contact between distal enhancers and promoters. However, the nature of such long-range genomic interactions and their relevance to transcription have recently been questioned [65]. The alternative TAG (TF activity gradient) model [65] emphasizes contact-independent “communication by diffusion” of TFs and their cofactors between enhancers and promoters.
However, given the diversity of interacting TFs and their multitudes of cofactors, it would be difficult, if not impossible, to dissect the specific contribution of an individual TF and/or cofactor to the transcription output. On the other hand, Pol II recruitment and Pol II pausing represent the final regulatory outcomes of a plethora of TFs and cofactors. Those regulatory steps are directly related to the final transcription output. Obtaining information about them is thus critical for understanding the logic of transcription regulation. Surprisingly, such information is lacking in circadian rhythm research (Figure 1).

Figure 1. While clock proteins collaborate with other TFs at distal enhancers to regulate rhythmic transcription of target genes, exactly how final transcription output is determined by Pol II recruitment, premature termination, and pause release activities near the TSS is still an open question (?) and needs to be systematically characterized.

Against this backdrop, researchers performed a ChIP-seq study of Tbp (TATA-binding protein, a TFIID subunit) and Pol II during daily transcription in mouse liver [66]. They used Tbp to measure Pol II recruitment at the gene promoter and assumed that the Tbp signal near the TSS is proportional to the rate of Pol II initiation (and reinitiation). However, Tbp and the mediator remain promoter-bound during PIC formation and Pol II initiation and reinitiation [67,68], while other GTFs such as TFIIB dissociate after Pol II initiation and recycle for Pol II binding during reinitiation [69,70]. Thus, relative to the signals of other GTFs, the Tbp signal might overestimate the Pol II initiation and reinitiation rates. Nonetheless, Tbp and other GTFs appear to exhibit concordant changes near the TSS [71,72], permitting the use of the Tbp signal to measure not only Pol II recruitment, but also initiation and reinitiation [73]. The Tbp signals within the TSS region (defined as −50 to +300 bp relative to the TSS [66,74]) were quantitated to measure Pol II recruitment ([Tbp]_TSS). The Pol II signals within the TSS region ([Pol II]_TSS) were quantitated to measure the paused Pol II, while the Pol II signals in the gene body ([Pol II]_GB) were quantitated to measure the gene transcription rate. The Pol II traveling ratios (TR: [Pol II]_TSS:[Pol II]_GB), which are quantitative measures of Pol II pausing [74,75], were also calculated. Through the systematic characterization of Pol II recruitment and pausing for 7414 genes during daily transcription, the study provides the first glimpse of their genome-wide characteristics.
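The traveling ratio defined above can be illustrated with a short calculation. This is a minimal sketch rather than the study's pipeline: the function name, the optional pseudocount, the gene names, and the signal values are all hypothetical, and the signals are assumed to be already normalized (e.g., to region length and sequencing depth).

```python
def traveling_ratio(pol2_tss, pol2_gb, pseudocount=0.0):
    """Return TR = [Pol II]_TSS : [Pol II]_GB for one gene.

    pol2_tss    : Pol II signal in the TSS region (assumed normalized)
    pol2_gb     : Pol II signal in the gene body (assumed normalized)
    pseudocount : optional stabilizer for weak signals (an assumption,
                  not a parameter described in the text)
    """
    denominator = pol2_gb + pseudocount
    if denominator <= 0:
        raise ValueError("gene-body Pol II signal must be positive")
    return (pol2_tss + pseudocount) / denominator

# Hypothetical signals for two genes with the same Tbp signal at the TSS
genes = {
    "geneA": {"tbp_tss": 12.0, "pol2_tss": 40.0, "pol2_gb": 4.0},
    "geneB": {"tbp_tss": 12.0, "pol2_tss": 10.0, "pol2_gb": 5.0},
}
ratios = {
    name: traveling_ratio(signal["pol2_tss"], signal["pol2_gb"])
    for name, signal in genes.items()
}
for name, tr in ratios.items():
    print(f"{name}: TR = {tr:.1f}")
```

In this made-up example, the two genes have the same [Tbp]_TSS but different TR values, so the gene with the higher TR would be read as more strongly paused despite similar Pol II recruitment.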