Chromatin, a macromolecular complex of DNA, RNA, and proteins, provides a framework for packaging genetic material within the cell nucleus. Its organization plays a crucial role in gene expression and is regulated by a diverse array of protein complexes in response to a dynamic code of histone posttranslational modifications and DNA modifications. Architectural proteins are essential epigenetic regulators that organize chromatin and control gene expression. CTCF (CCCTC-binding factor) is a key architectural protein responsible for maintaining the intricate 3D structure of chromatin. Because of its multivalent properties and its plasticity in binding diverse sequences, CTCF resembles a Swiss Army knife for genome organization.
1. Introduction
CTCF was originally described in chickens as a protein that binds to a region upstream of the c-myc promoter. Because that binding site contains three regularly spaced repeats of the sequence CCCTC, the protein was named CCCTC-binding factor, or CTCF
[1]. Later, it was found that CTCF is a ubiquitously expressed and highly conserved protein in vertebrates
[2][3]. CTCF consists of 727 amino acids (aa) distributed in three domains: a zinc finger DNA-binding domain flanked by intrinsically disordered N- and C-terminal regions (
Figure 1a). The DNA-binding domain of CTCF has 11 zinc fingers (ZFs), which allow it to interact dynamically with DNA
[4][5][6]. CTCF uses different combinations of its ZFs to recognize and bind a variety of DNA sequences, which is why it is considered a multivalent protein
[7][8]. However, around 80% of its target sequences contain the core motif 5’-CCACCAGGTGG-3’, which is recognized by ZFs 4 to 7. Non-conserved flanking sequences can be recognized by ZFs 1–2 or ZFs 8–11, which helps to stabilize the CTCF-DNA complex
[9][10][11]. A peculiarity of CTCF is that ZF1 and ZF10 have an RNA binding domain (RBD) that is used to interact with several lncRNAs, providing additional anchorage points for the protein
[10][12].
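As a rough illustration of what it means for a target sequence to "contain the core motif", the following minimal Python sketch scans a DNA string for exact copies of 5’-CCACCAGGTGG-3’ on both strands. The function names and the example sequence are hypothetical, and exact string matching is a simplifying assumption: real CTCF sites are degenerate and are normally scored with a position weight matrix rather than a literal match.

```python
# Minimal sketch: exact-match scan for the CTCF core motif on both strands.
# Assumption: literal matching only; real binding sites tolerate mismatches.

CORE_MOTIF = "CCACCAGGTGG"  # core motif quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_core_motif(seq: str, motif: str = CORE_MOTIF):
    """Return sorted (0-based start, strand) tuples for every exact motif hit."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical example sequence: one forward-strand and one reverse-strand copy.
example = "AAA" + CORE_MOTIF + "TT" + reverse_complement(CORE_MOTIF) + "A"
print(find_core_motif(example))
```

The motif is not its own reverse complement, so both orientations have to be searched; the strand label matters in practice because the orientation of a CTCF site is usually reported alongside its position.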
CTCF has tens of thousands of genomic binding sites, some of which are conserved between species and tissues
[13]. The actions of CTCF depend on the location of its binding sites, which lie mainly in intergenic regions, although they can also be found in regulatory regions such as enhancers and gene promoters, and within gene bodies
[14][15][16]. The main functions of CTCF include maintaining topologically associated domains (TADs), acting as a barrier to the spread of heterochromatic structures, and defining the boundaries between euchromatin and heterochromatin; for this reason, CTCF has been termed an architectural protein
[17][18][19][20]. CTCF also regulates DNA anchorage to cellular structures such as the nuclear lamina
[17][18], acts as a protein insulator by controlling the interactions between enhancers and promoters
[21], and can function as a scaffold protein for transcription factors
[22][23][24] and epigenetic factors
[25]. Depending on where it binds in the genome, CTCF has also been shown to participate in processes such as alternative splicing: it pauses RNA Polymerase II (RNAP II) at alternative exons, thus providing the temporal context required for co-transcriptional spliceosome formation at weak upstream splice sites
[26]. CTCF also interacts with lncRNAs, which is important for the transcriptional regulation of genes such as Xist, a lncRNA responsible for X chromosome inactivation. For these reasons, CTCF is considered a highly versatile protein, similar to a Swiss Army knife. A summary of its functions is shown in
Figure 1b.
Figure 1. The architectural factor CTCF. (
a) CTCF is an 82-kDa protein that contains three domains: an N-terminal region, a C-terminal region, and a central domain of 11 zinc fingers. CTCF uses the zinc fingers of this domain cooperatively to bind to DNA. RBD: RNA binding domain, ZF: zinc finger. (
b) Overview of the wide range of CTCF mechanisms of action, including chromatin looping, RNA Polymerase II (Pol II) recruitment, transcriptional regulation, boundary definition, DNA anchorage, insulation, alternative splicing, and RNA binding, among others. TAD: topologically associated domain. Created with
BioRender.com (accessed on 23 April 2023).
2. CTCF Regulates the Chromatin Structure through Interactions with Several Epigenetic Factors
The chromatin status is dynamic and can be regulated by covalent modification of the amino-terminal ends of histones that protrude from the nucleosome and are accessible to enzymes that chemically modify them through a system of writing, reading, and erasing complexes
[27]. These modifications correspond to a kind of code that works in conjunction with the DNA sequence to determine the state of the chromatin and establishes and stabilizes gene expression patterns
[28]. Because CTCF acts as a master regulator of chromatin, it is highly probable that both its actions and its recruitment to DNA depend on the chromatin context. To better understand the interactions between CTCF and other proteins with epigenetic functions, researchers analyzed data from the literature, as well as the STRING database
[29] and the Integrated Interactions Database
[30] to find CTCF protein partners. While many of these partners are transcription factors that use CTCF as a scaffold to shape the chromatin structure
[31], CTCF also interacts with other proteins that have epigenetic functions, such as DNA and histone demethylases
[32][33][34]. Identifying the CTCF protein partners involved in epigenetic processes may provide valuable insights into the complex regulatory mechanisms of chromatin organization and gene expression. To identify these proteins, researchers filtered the list of CTCF protein partners using the annotations available in the EpiFactors database
[35]. The resulting CTCF epigenetic factor targets are shown in
Figure 2.
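The filtering step described above amounts to pooling the partner lists from the different sources and keeping only the proteins annotated in EpiFactors. A minimal sketch of that logic follows, under stated assumptions: the function name is hypothetical, the inputs are toy sets (not actual STRING, Integrated Interactions Database, or EpiFactors exports), and `PARTNER_A`, `PARTNER_B`, and `UNRELATED_FACTOR` are placeholder symbols. The source does not describe the exact procedure or thresholds used, so this shows only the set logic.

```python
# Minimal sketch: pool CTCF partners from several sources, then keep only
# those annotated as epigenetic factors. All inputs below are toy data.


def epigenetic_partners(string_partners, iid_partners, literature_partners, epifactors):
    """Return the sorted partners (from any source) that are also in EpiFactors."""
    all_partners = set(string_partners) | set(iid_partners) | set(literature_partners)
    return sorted(all_partners & set(epifactors))


# Toy inputs (hypothetical; gene symbols borrowed from the text for readability).
string_partners = {"EED", "YY1", "PARTNER_A"}
iid_partners = {"SUZ12", "YY1", "PARTNER_B"}
literature_partners = {"CHD8"}
epifactors = {"EED", "SUZ12", "YY1", "CHD8", "UNRELATED_FACTOR"}

print(epigenetic_partners(string_partners, iid_partners, literature_partners, epifactors))
```

Taking the union before the intersection means that a partner reported by only one source is retained; requiring support from several sources would be a stricter alternative.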
Figure 2. Epigenetic factors that interact with CTCF. The protein–protein interactions between CTCF and other proteins with epigenetic functions. Colors indicate the EpiFactors category to which each protein belongs: histone modification reader in yellow, chromatin remodeling in mint, Polycomb group proteins in navy blue, DNA demethylation in pink, histone modification eraser in salmon, RNA modification in green, and histone modification writer in orange. Created with
BioRender.com (accessed on 23 April 2023).
Many of these interacting proteins participate in shaping the 3D conformation of the genome, such as the DNA helicases CHD7
[36], CHD8
[37] and CHD1L
[38], the topoisomerases TOP2A
[39] and TOP2B
[40], and the components of chromatin remodeling complexes such as ARID1A
[41], YY1
[42], YAF2
[22] and BPTF
[31]. This suggests that CTCF works in combination with other remodeling cofactors to establish chromatin domains.
It is also worth noting that CTCF interacts with several members of the Polycomb group (PcG). These proteins are part of a system that regulates histone post-translational modifications, and their action is generally associated with the transcriptional repression of tissue-specific genes. The group comprises two main complexes, the Polycomb Repressive Complexes 1 and 2 (PRC1 and PRC2). PRC2 acts as the writer, as it is responsible for the mono-, di-, and trimethylation of lysine 27 of histone H3. The trimethylated mark (H3K27me3) is associated with silenced gene promoters and facultative heterochromatin. H3K27me3 is recognized by PRC1 (the reader), which binds to chromatin, monoubiquitinates lysine 119 of histone H2A (H2AK119ub), and prevents transcription by blocking the recruitment of RNA polymerase II
[43][44]. CTCF interacts with EED and SUZ12, which are members of the PRC2 complex; a couple of studies have proposed that CTCF could guide PRC2 to gene promoters that are susceptible to repression through H3K27 methylation
[45][46]. CTCF also interacts with BMI1, PCGF1, and RYBP, which are members of the PRC1 complex. Although the biological significance of these interactions remains largely unexplored, one study suggests that these proteins may regulate the organization of CTCF-mediated chromatin interactions
[47].
This entry is adapted from the peer-reviewed paper 10.3390/cells12101357