Molecular Phase Separation in Chromosomes: Comparison
Please note this is a comparison between Version 1 by Je-Kyung Ryu and Version 2 by Camila Xu.

Biomolecular phase separation denotes the demixing of a specific set of intracellular components without membrane encapsulation.

  • biomolecular phase separation
  • bridging-induced phase separation
  • intrinsically disordered proteins
  • multivalent DNA-binding proteins
  • stickers-and-spacers framework
  • compartments
  • cohesin
  • chromosomes

1. Introduction

The various components of cells (especially eukaryotic cells) are organized both spatially and temporally for efficient functioning; membrane-bound organelles are examples of spatiotemporal compartmentalization. However, other organelles lack a membrane and are known as membraneless organelles [1]. Examples include nucleoli for ribosome synthesis in the nucleus [2], centrosomes for microtubule nucleation [3], Cajal bodies for the synthesis of spliceosomes [4], and stress granules for modulation of the stress response [5]. Although these organelles do not enclose their components within a membrane, they do not simply mix with their surroundings. Recent studies have found that demixing occurs spontaneously via liquid-liquid phase separation (LLPS) [6][7][8][9][10], a phenomenon known in physics and chemistry for more than a century. Demixing occurs in a multi-component system when the energy gain from demixing outweighs the entropic loss. A typical example is a water-oil system: mixing creates unfavorable water-oil molecular interactions, whose energetic cost exceeds the entropic penalty of demixing. Hence, such a system favors demixing under ambient conditions.

In 2009, Brangwynne and colleagues published a pioneering study in this field [11], which showed the liquid-like properties of P granules, a type of membraneless organelle in C. elegans. P granules exchange their components with the cytoplasm and exhibit fusion, dripping, and wetting behaviors. The authors also estimated the viscosity and surface tension of the granules. Subsequently, the material properties and biological implications of membraneless organelles have attracted significant interest [12][13]. A membraneless organelle can recruit specific molecules, whose local concentration becomes significantly higher than their concentration in the cytosol. As the concentration determines the reaction rate, the membraneless organelle can serve as a reaction center for the recruited molecules. In addition, because of their liquid-like nature, membraneless organelles allow the rapid rearrangement of specific molecules upon perturbations such as temperature change; cells can use this mechanism to respond rapidly to an abrupt change in the environment. LLPS is involved in various biological processes, such as immune signaling [14], miRISC assembly [15], autophagy [16], nucleolus formation [17], stress granule assembly [18], transcriptional condensate assembly [19], and cohesin cluster formation [20].
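The link between local concentration and reaction rate can be illustrated with a minimal sketch. It assumes ideal mass-action kinetics for a single bimolecular step; the rate constant, concentrations, and 10-fold enrichment are hypothetical values chosen only for illustration and do not come from any cited study.

```python
def bimolecular_rate(k, conc_a, conc_b):
    """Mass-action rate v = k [A][B] for an elementary bimolecular step."""
    return k * conc_a * conc_b


# Hypothetical numbers, for illustration only.
K_ON = 1.0e6          # rate constant in M^-1 s^-1 (assumed)
CYTOSOL_CONC = 1e-6   # 1 uM of each reactant in the cytosol (assumed)
ENRICHMENT = 10.0     # fold enrichment of each reactant in the condensate (assumed)

rate_cytosol = bimolecular_rate(K_ON, CYTOSOL_CONC, CYTOSOL_CONC)
rate_condensate = bimolecular_rate(
    K_ON, ENRICHMENT * CYTOSOL_CONC, ENRICHMENT * CYTOSOL_CONC
)

# Both reactants are enriched, so the local rate scales with ENRICHMENT squared.
fold_change = rate_condensate / rate_cytosol
print(f"local rate increases about {fold_change:.0f}-fold")
```

Under these assumptions, enriching both reactants by the same factor raises the local rate by the square of that factor, which is one simple way to see why a condensate can act as a reaction center.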

It has also been suggested that phase separation drives chromosome organization and various genome-related biological functions [21][22]. DNA, which carries the genetic information of a cell, is densely packed in the nucleus. The efficient packing of DNA from a stretched, meters-long chain into a micrometer-scale structure is accomplished by chromatin, which is a molecular complex of DNA, protein, and RNA. Chromatin can be divided into two compartments, A and B, according to gene content and location, and chromatin compartmentalization is believed to be driven by phase separation [23][24]. In addition, membraneless condensates called nuclear condensates or nuclear bodies form inside the nucleus [25], and their formation and regulation can be explained by LLPS [22] (Figure 1). In this review, we highlight recent advances in the understanding of phase separation in the nucleus, where phase separation involves an extremely long heteropolymer, DNA, and contributes to chromosome organization and DNA-related biological functions.

Figure 1. Biomolecular condensates in the nucleus: A and B compartments, nucleolus, paraspeckles, and transcriptional condensates. Chromosomes are largely segregated via phase separation into two compartments: euchromatin (A, red) and heterochromatin (B, blue). Phase separation is also involved in the formation and regulation of membraneless organelles such as the nucleolus (gray), transcription condensates (magenta), and paraspeckles (green) in the nucleus.

2. Principles of Phase Separation

Consider two types of molecules, X and Y, in a test tube. If homotypic interactions (X-X and Y-Y) are more favorable than heterotypic interactions (X-Y), the system energetically prefers the two components to separate (phase separation). Meanwhile, entropy always drives the system towards mixing. Hence, there is a “tug of war” between the two driving forces, energy and entropy, and the molecular details determine whether phase separation occurs under the given experimental conditions (temperature, concentration, salt condition, etc.). A phase diagram is used to summarize the conditions of phase separation for the system of interest (Figure 2).

Figure 2. Phase diagrams of prototypical two-component systems. Phase diagrams for (A) the monomer-monomer system and (B) the polymer-monomer system. Blue and green dots represent different types of unit molecules. The x-axis indicates the concentration of unit molecules of the blue species, and the y-axis indicates the system temperature. In panel B, the valence of a multimer, M, is set to three. Multimerization results in the expansion of the two-phase regime. (C) Anatomy of a phase diagram (see text for the definitions of different concentrations). The x-axis shows the multimer concentration, and has a different scale from panels A and B. The multimer concentration, however, is proportional to the unit molecule concentration, and the two can be interchangeably used.
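The energy-entropy competition and the effect of multimerization can be sketched with the Flory-Huggins lattice model. This is a swapped-in textbook mean-field model, not necessarily the one used to draw Figure 2. The sketch assumes that a multimer of valence three can be treated as a chain of `n = 3` linked units, and that the interaction parameter `chi` grows as temperature falls (roughly `chi ∝ 1/T`), so a lower critical `chi` corresponds to a higher critical temperature.

```python
import math


def mixing_free_energy(phi, chi, n=1):
    """Flory-Huggins free energy of mixing per lattice site, in units of k_B T.

    phi: volume fraction of the multimer (0 < phi < 1)
    chi: interaction parameter (larger means X-Y contacts are less favorable)
    n:   number of linked units per multimer (n = 1 is the monomer-monomer case)
    """
    entropy_term = (phi / n) * math.log(phi) + (1 - phi) * math.log(1 - phi)
    energy_term = chi * phi * (1 - phi)
    return entropy_term + energy_term


def critical_chi(n):
    """Smallest chi at which demixing becomes possible."""
    return 0.5 * (1 + 1 / math.sqrt(n)) ** 2


def critical_phi(n):
    """Volume fraction at the critical point."""
    return 1 / (1 + math.sqrt(n))


def is_unstable(phi, chi, n=1):
    """True inside the spinodal, where the free energy curvature is negative."""
    curvature = 1 / (n * phi) + 1 / (1 - phi) - 2 * chi
    return curvature < 0


for n in (1, 3):
    print(f"n = {n}: critical chi = {critical_chi(n):.3f}, "
          f"critical phi = {critical_phi(n):.3f}")

# At the same interaction strength, the trimer demixes while the monomer does not.
print(is_unstable(critical_phi(1), chi=1.5, n=1))
print(is_unstable(critical_phi(3), chi=1.5, n=3))
```

In this model the mixing entropy of the multimer is divided by `n`, so linking units together weakens the entropic drive toward mixing and lowers the interaction strength needed for demixing, consistent with the expansion of the two-phase regime noted in the caption.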

At temperatures below the critical temperature $T_c$ (above which entropy disrupts phase separation), three different transition concentrations can be designated on the phase diagram (see Figure 2C). As the multimer concentration increases, the system first reaches the saturation concentration ($c_{\mathrm{sat}}$), beyond which it separates into two phases. Next, it reaches the percolation concentration ($c_{\mathrm{perc}}$), which divides unnetworked and networked systems. Because the spatial proximity of multimers is driven by bond formation, the percolation concentration is coupled to the saturation concentration [26][27]. Finally, at the droplet concentration ($c_{\mathrm{drop}}$), the system re-enters the one-phase region.

Proteins are the essential drivers of biomolecular phase separation, and their roles and modes of action in LLPS have been extensively studied. In this section, we discuss a simple conceptual framework that can explain the phase behaviors of proteins. The framework is useful in understanding biomolecular LLPS and can be extended further to other multimer systems. Two representative types of protein are known to undergo phase separation. Multi-domain proteins possess well-defined folded domains connected by disordered linkers. Several multi-domain protein systems have been reported to exhibit phase separation behavior [28][29][30]. A more prominent group consists of intrinsically disordered proteins (IDPs), which lack well-defined three-dimensional structures, even under physiological conditions [31]. Many phase-separating systems identified in vivo contain significant portions of intrinsically disordered regions (IDRs) [32]. DNA and RNA, another important group of biomolecules in living cells, can also participate in intracellular phase separation [33][34][35].

Biomolecular condensates consist of hundreds or thousands of different types of biomolecules. Do they all contribute to the formation of condensates, or is there a subset of essential players in condensate formation? The latter seems to be the case in most systems, and the essential drivers are termed scaffolds. Typically, scaffolds are defined as molecules that can form droplets when isolated in vitro (to be rigorous, the removal of scaffold molecules from in vivo condensates must also be shown to disrupt phase separation). The other molecules are recruited to condensates through their interactions with the scaffolds and are termed clients [36]. Although clients are not necessary for the formation of condensates, they can modulate the properties of condensates [37]. Recruitment leads to a non-uniform distribution of client molecules inside the condensates, as they tend to remain around the scaffolds [38].

3. Phase Separation in a Nucleus

Interphase chromosomes are segregated into two distinct compartments. The transcriptionally active, gene-rich form of chromatin is called euchromatin, and the transcriptionally inactive form is called heterochromatin (Figure 1, red and blue denoting euchromatin and heterochromatin, respectively) [39][40][41][42][43][44]. Compartmentalization seems to be driven by the phase separation of some proteins, such as heterochromatin protein 1 alpha (HP1α), a protein enriched in heterochromatin. Recent studies have shown that HP1α induces liquid droplet formation, and droplet formation tightly compacts DNA, supporting a role for the phase separation of HP1α in chromosome organization [23][24].

The nucleolus is an example of the scaffold-client model. Among the hundreds of different biomolecules within a nucleolus [45], only a few proteins are responsible for the formation of droplets and of the layered structure. Fibrillarin (FBL) is a protein that participates in the processing of ribosomal RNA and is enriched in the dense fibrillar component (DFC). Nucleophosmin (NPM1) is a protein associated with nucleolar ribonucleoprotein structures and is abundant in the granular component (GC). A mixture of FBL and NPM1 was shown to undergo phase separation in vitro and generate two-layer droplets, similar to the DFC-GC structure [46].

The structural features of typical transcription factors (TFs) can explain how TFs induce LLPS. Typical TFs possess IDRs that can weakly interact with those of cofactors, and these multivalent interactions can induce dynamic assembly formation and be controlled by post-translational modification. Generally, TFs also have stable structured domains for selective DNA/RNA binding, which provide additional weak interactions [47]. For example, FUS, EWSR1, and TAF15, known as the FET family, are mostly disordered and capable of binding to RNA molecules [48]. These are well-known model systems for phase separation in vitro [49][50]. The TFs interact with the intrinsically disordered C-terminal domain of RNA polymerase II (Pol II), and this C-terminal domain is key to the formation of large spherical droplets, which have liquid-like properties in living cells [51], even at endogenous expression levels [19][52].

Like the phase separation of eukaryotic nuclear proteins and prokaryotic nucleoid proteins, the phase separation of viral proteins is involved in the cellular processes of viruses [53][54][55]. For example, RNA viruses, such as respiratory syncytial virus (RSV), vesicular stomatitis virus (VSV), and coronaviruses, appear to replicate in viral inclusion bodies, which are membraneless condensates formed by phase separation in host cells [53][54][56][57][58]. Moreover, several studies on coronaviruses have shown that the assembly of viral capsids and genomes occurs in dynamic cytoplasmic foci formed by phase separation [59][60], suggesting that phase separation plays a role in the replication and packaging of coronaviruses. Coronaviruses contain a relatively long (~30 kb) single-stranded RNA genome, which is compacted into a viral particle in a highly specific manner that excludes host RNA and many subgenomic RNAs [61]. In particular, the nucleocapsid protein (N-protein) of SARS-CoV-2 drives viral RNA genome packaging via LLPS, which is mediated by interactions between specific viral RNA sequences and the multivalent RNA-binding domains and IDRs of the viral proteins [55][62][63][64][65][66][67]. Specific RNA sequences interact with the N-proteins for LLPS, and this seems to ensure that the viral RNA is not entangled with other long cellular RNA molecules [68][69]. LLPS studies on viruses provide novel perspectives on how the composition of RNA determines its packaging into a small viral particle.

4. Local Phase Separation Models: BIPS and SIPS

Although a protein in the bridging-induced phase separation (BIPS) model is involved in multiple DNA interactions, it does not require multiple protein-protein interactions, which are the main driving forces of self-association-induced phase separation (SIPS). Thus, BIPS does not require a scaffold protein to have an IDR, which typically provides multivalency and flexibility, because long, flexible DNA can itself provide multiple binding sites for multivalent DNA-binding proteins. Moreover, while DNA organization is strongly coupled to DNA-protein cluster formation in BIPS, the organization of DNA can be completely independent of phase separation in SIPS (Figure 3D). Although the molecular mechanisms differ, BIPS shares many similarities with SIPS. For example, condensates formed by BIPS can be liquid-like [20]. Hence, the techniques used to study SIPS can be applied to analyze BIPS.

Figure 3. BIPS versus SIPS. (A) Cartoon of a multivalent DNA-binding protein that has at least two DNA-binding sites. DNA-binding sites of the protein are depicted as orange circles, and the protein is denoted as a blue circle. (B) Schematic of the BIPS model. Two DNA-binding sites per protein are sufficient for condensation, and a long DNA molecule is irreplaceable in this mechanism. (C) Cartoon of a multivalent protein-binding protein that induces typical phase separation. Yellow circles on the protein (blue circle) depict protein binding sites. (D) Typical phase separation mechanism (SIPS), which uses multivalent protein-protein interactions. At least three binding sites are necessary for phase separation, and DNA plays an auxiliary role in this process. (E,F) Dependence of the protein-DNA cluster size on the length of DNA shown in the previous study of cohesin-mediated BIPS [20]. (E) Cartoons of possible protein-DNA complex topologies for a range of DNA lengths and (F) a plot showing cluster size versus DNA length [20]. With <3 kbp of DNA, a single protein binds to DNA with no cooperativity (blue line). With ~3 kbp DNA, multivalent DNA-binding proteins can bridge a DNA to form a loop. For longer DNA (>3 kbp), a larger cluster can be formed, and the cluster size scales as a power law with the DNA length (red line).

The cohesin-SMC complex is important for interphase chromosome organization [70][71], and in vitro experiments have shown that the complex forms condensates via the BIPS mechanism [20]. Cohesin is a good model for a protein with multiple DNA-binding sites. Because it acts primarily as a motor protein that extrudes DNA loops for interphase chromosome organization, it must have at least two DNA-binding sites that move relative to each other in an ATP hydrolysis-dependent manner. Multiple DNA-binding sites on the cohesin protein have been confirmed by various structural studies, suggesting that it can bridge distant DNA segments [71][72][73][74].

The size of cohesin-SMC clusters depends nonlinearly on DNA length, and the cohesin-dependent BIPS mechanism can explain this behavior by considering DNA bridging activity (Figure 3E,F) [20]. In an experiment, the DNA length was varied from 100 bp to 50 kbp, while the DNA concentration was fixed. The DNA-cohesin mixture was incubated and imaged using atomic force microscopy (AFM). For short DNA ($l < 3$ kbp), no clear cohesin-DNA cluster was formed; however, beyond a crossover point of $l_c \sim 3$ kbp, the cluster size increased rapidly with DNA length, scaling as a power law (Figure 3F). The crossover point can be explained quantitatively by considering the free energy cost of DNA looping when a cohesin protein bridges the DNA. When a single cohesin complex bridges two DNA sites to form a loop, the free energy change can be roughly estimated from two contributions: (1) the DNA bending energy; and (2) the entropic cost of DNA looping. The optimal length for DNA looping can be obtained by minimizing the following free energy:

$$\frac{F}{k_B T} = \frac{2\varepsilon\, l_p}{l} + 1.5 \log\left(\frac{l}{l_p}\right) \tag{8}$$

where $l$ is the loop size when DNA is bridged by a single cohesin protein complex, $l_p = 50$ nm is the persistence length of DNA, and $\varepsilon = 16$ is the shape parameter for a teardrop-shaped loop [75]. The free energy is minimized at a DNA length of around 3 kbp; hence, DNA must be at least about 3 kbp long to be bridged. A longer DNA construct (>3 kbp) provides a nucleation point for further growth of the condensates, which catalyzes cluster growth. The power-law scaling of cluster size with DNA length was reproduced by computer simulations, which modeled cohesin as a patchy particle with two distinct DNA-binding sites [20].
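The location of the minimum can be checked with a short sketch. It reads Equation (8) as F/(k_B T) = 2ε l_p / l + 1.5 ln(l / l_p), takes l_p = 50 nm and ε = 16 from the text, and assumes a rise of 0.34 nm per base pair (the standard B-DNA value, not stated in the text) to convert contour length to base pairs. The function names and the grid range are illustrative choices.

```python
import math

L_P_NM = 50.0      # DNA persistence length in nm (from the text)
EPSILON = 16.0     # shape parameter for a teardrop loop (from the text)
NM_PER_BP = 0.34   # rise per base pair of B-DNA (assumption, not from the text)


def loop_free_energy(l_nm, l_p=L_P_NM, eps=EPSILON):
    """Loop free energy F / (k_B T): bending term plus entropic looping term."""
    return 2 * eps * l_p / l_nm + 1.5 * math.log(l_nm / l_p)


def optimal_loop_length(l_p=L_P_NM, eps=EPSILON):
    """Analytic minimum: setting dF/dl = 0 gives l* = 2 * eps * l_p / 1.5."""
    return 2 * eps * l_p / 1.5


def numeric_minimum(lo_nm=100.0, hi_nm=5000.0, step_nm=1.0):
    """Brute-force grid search, used only as a cross-check of the analytic result."""
    n_points = int((hi_nm - lo_nm) / step_nm) + 1
    grid = (lo_nm + i * step_nm for i in range(n_points))
    return min(grid, key=loop_free_energy)


l_star_nm = optimal_loop_length()
l_star_bp = l_star_nm / NM_PER_BP
print(f"optimal loop length: {l_star_nm:.0f} nm, about {l_star_bp / 1000:.1f} kbp")
print(f"grid-search minimum: {numeric_minimum():.0f} nm")
```

Under these assumptions the minimum falls at roughly 1.07 µm of contour length, or about 3.1 kbp, in line with the crossover length of ~3 kbp quoted above. The bending term dominates for short loops and the logarithmic entropic term for long ones, which is why a finite optimum exists.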

Although BIPS and SIPS seem to be opposing concepts, they can work together to induce efficient phase separation. As discussed, in the BIPS model, a bridged loop can act as a nucleation point (Figure 3B). The loop can attract multivalent proteins involved in SIPS (Figure 3C), resulting in interplay between BIPS and SIPS. It is possible that some topologically associating domains (TADs), observed via Hi-C analysis [76], are formed by BIPS, since an extruded DNA loop at convergent CCCTC-binding factor (CTCF)-binding sites can act as a nucleation point for the growth of assemblies of multivalent DNA-binding proteins. If this model is correct, interactions between a DNA loop and other nuclear condensates, such as transcriptional condensates, would be observed.