Biomolecular phase separation denotes the demixing of a specific set of intracellular components without membrane encapsulation.
The various components of cells (especially eukaryotic cells) are organized both spatially and temporally for efficient functioning; membrane-bound organelles are examples of spatiotemporal compartmentalization. However, other organelles, known as membraneless organelles, lack a membrane structure. They include nucleoli for ribosomal synthesis in the nucleus, centrosomes for microtubule nucleation, Cajal bodies for the synthesis of spliceosomes, and stress granules for modulation of the stress response. Although these organelles do not enclose their components within a membrane, they do not simply mix with their surroundings. Recent studies have found that demixing occurs spontaneously via liquid-liquid phase separation (LLPS), a phenomenon known in physics and chemistry for more than a century. Demixing occurs in a multi-component system when the energy gain from demixing is greater than the associated entropic loss. A good example is a typical water-oil system: mixing water and oil creates unfavorable water-oil molecular interactions, whose energetic cost exceeds the entropic penalty of demixing. Hence, such a system favors demixing under ambient conditions.
In 2009, Brangwynne and colleagues published a pioneering study in this field, which showed the liquid-like properties of P granules, a type of membraneless organelle in C. elegans. P granules exchange their components with the cytoplasm and exhibit fusion, dripping, and wetting behaviors. The authors also estimated the viscosity and surface tension of the granules. Subsequently, the material properties and biological implications of membraneless organelles have attracted significant interest. A membraneless organelle can recruit specific molecules, whose local concentration becomes significantly higher than the cytosol concentration. As the concentration determines the reaction rate, the membraneless organelle can serve as a reaction center of the recruited molecules. In addition, because of their liquid-like nature, membraneless organelles allow the rapid arrangement of specific molecules upon perturbations such as temperature change; cells can use this mechanism to respond rapidly to an abrupt change of the environment. LLPS is involved in various biological processes, such as immune signaling, miRISC assembly, autophagy, nucleolus formation, stress granule assembly, transcriptional condensate assembly, and cohesin cluster formation.
It has also been suggested that phase separation drives chromosome organization and various genome-related biological functions. DNA, which carries the genetic information of a cell, is densely packed in the nucleus. The efficient packing of DNA from a stretched, meters-long chain into a micrometer-scale structure is accomplished by chromatin, which is a molecular complex of DNA, protein, and RNA. Chromatin can be divided into two compartments, A and B, according to the gene content and location, and chromatin compartmentalization is believed to be driven by phase separation. In addition, membraneless condensates called nuclear condensates or nuclear bodies form inside the nucleus, and their formation and regulation can be explained by LLPS (Figure 1).
Consider two types of molecules, X and Y, in a test tube. If homotypic interactions (X-X and Y-Y) are more favorable than heterotypic interactions (X-Y), the system energetically prefers the two components to separate (phase separation). Meanwhile, entropy always drives the system towards mixing. Hence, there is a “tug of war” between the two driving forces, energy and entropy, and the molecular details determine whether phase separation occurs under the given experimental conditions (temperature, concentration, salt condition, etc.). A phase diagram is utilized to summarize the conditions of phase separation for the system of interest (Figure 2).
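The tug of war between energy and entropy can be illustrated with the textbook regular-solution (Flory-Huggins-type) lattice model of a symmetric binary mixture. This model is not taken from the studies discussed here; it is a minimal sketch in which the interaction parameter `chi` (in units of k_B T), the function names, and the symmetric-mixture assumption are all illustrative choices. The free energy of mixing per site is f(φ) = φ ln φ + (1 − φ) ln(1 − φ) + χφ(1 − φ), where the logarithmic terms favor mixing and the χ term penalizes X-Y contacts. In this model, demixing appears only when χ exceeds the critical value of 2.

```python
import math

def mixing_free_energy(phi, chi):
    """Free energy of mixing per lattice site, in units of k_B*T (regular-solution model)."""
    entropy_term = phi * math.log(phi) + (1.0 - phi) * math.log(1.0 - phi)  # favors mixing
    energy_term = chi * phi * (1.0 - phi)                                   # penalizes X-Y contacts
    return entropy_term + energy_term

def dilute_binodal(chi, iterations=200):
    """Volume fraction of the dilute coexisting phase, or None if the system stays mixed.

    For a symmetric mixture the two coexisting phases sit at phi and 1 - phi,
    where df/dphi = ln(phi / (1 - phi)) + chi * (1 - 2*phi) = 0.
    """
    if chi <= 2.0:  # below the critical interaction strength, entropy wins
        return None

    def slope(phi):
        return math.log(phi / (1.0 - phi)) + chi * (1.0 - 2.0 * phi)

    # The root lies between 0 and the spinodal, where the slope is at its maximum.
    lo = 1e-300
    hi = 0.5 * (1.0 - math.sqrt(1.0 - 2.0 / chi))
    for _ in range(iterations):  # plain bisection
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for chi in (1.5, 2.5, 3.0, 4.0):
        phi = dilute_binodal(chi)
        if phi is None:
            print(f"chi = {chi}: one phase (mixed)")
        else:
            print(f"chi = {chi}: two phases at phi = {phi:.4f} and {1.0 - phi:.4f}")
```

Scanning `chi` (which in this model scales inversely with temperature) and recording the two coexisting compositions traces out a simple phase diagram of the kind summarized in Figure 2.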
At temperatures below the critical temperature $T_c$ (above which entropy disrupts phase separation), three different transition concentrations can be designated on the phase diagram (see Figure 2C). As the multimer concentration increases, the saturation concentration ($c_{\mathrm{sat}}$) is reached in the system, after which the two phases are separated. Subsequently, the percolation concentration ($c_{\mathrm{perc}}$) is reached, which divides unnetworked and networked systems. Because the spatial proximity of multimers is driven by bond formation, the percolation concentration is coupled to the saturation concentration. Finally, at the droplet concentration ($c_{\mathrm{drop}}$), the system re-enters the one-phase region.
Proteins are the essential drivers of biomolecular phase separation, and their roles and modes of action in LLPS have been extensively studied. In this section, we discuss a simple conceptual framework that can explain the phase behaviors of proteins. The framework is useful in understanding biomolecular LLPS and can be extended further to other multimer systems. Two representative types of protein are known to undergo phase separation. Multi-domain proteins possess well-defined folded domains connected by disordered linkers. Several multi-domain protein systems have been reported to exhibit phase separation behavior. A more prominent group comprises intrinsically disordered proteins (IDPs), which lack well-defined three-dimensional structures, even under physiological conditions. Many phase separation systems identified in vivo contain significant portions of intrinsically disordered regions (IDRs). DNA and RNA, important groups of biomolecules in living cells, can also participate in intracellular phase separation.
Biomolecular condensates consist of hundreds or thousands of different types of biomolecules. Do they all contribute to the formation of condensates, or is there a subset of essential players in condensate formation? The latter seems to be the case in most systems, and the essential drivers are termed scaffolds. Typically, scaffolds are defined as molecules that can form droplets when isolated in vitro (to be rigorous, the removal of scaffold molecules from in vivo condensates must be shown to interrupt phase separation). The other molecules are recruited to condensates by their interactions with the scaffolds and are termed clients. Although clients are not necessary for the formation of condensates, they can modulate the properties of condensates. Recruitment leads to the non-uniform distribution of client molecules inside the condensates, as they tend to remain around the scaffolds.
Interphase chromosomes are segregated into two distinct compartments. The transcriptionally active, gene-rich form of chromatin is called euchromatin, and the transcriptionally inactive form is called heterochromatin (Figure 1, red and blue denoting euchromatin and heterochromatin, respectively). Compartmentalization seems to be driven by the phase separation of some proteins, such as heterochromatin protein 1 alpha (HP1α), a protein enriched in heterochromatin. Recent studies have shown that HP1α induces liquid droplet formation, and droplet formation tightly compacts DNA, supporting a role for the phase separation of HP1α in chromosome organization.
The nucleolus is an example of the scaffold-client model. Among the hundreds of different biomolecules within a nucleolus, only a few proteins are responsible for the formation of droplets and of the layered structures within them. Fibrillarin (FBL) is a protein that participates in the processing of ribosomal RNA and is enriched in the dense fibrillar component (DFC). Nucleophosmin (NPM1) is a protein associated with nucleolar ribonucleoprotein structures and is abundant in the granular component (GC). A mixture of FBL and NPM1 was shown to reproduce phase separation in vitro and generate two-layer droplets, similar to the DFC-GC structure.
The structural features of typical transcription factors (TFs) can explain how TFs induce LLPS. Typical TFs possess IDRs that can weakly interact with those of cofactors, and these multivalent interactions can induce dynamic assembly formation and be controlled by post-translational modification. Generally, TFs also have stable structured domains for selective DNA/RNA binding, which provide additional weak interactions. For example, FUS, EWSR1, and TAF15, known as the FET family, are mostly disordered and capable of binding to RNA molecules. These are well-known model systems for phase separation in vitro. The TFs interact with the intrinsically disordered C-terminal domain of RNA polymerase II (Pol II), and this C-terminal domain is key to the formation of large spherical droplets, which exhibit liquid-like properties in living cells, even at endogenous expression levels.
Like phase separation of eukaryotic nuclear proteins and prokaryotic nucleoid proteins, phase separation of viral proteins is involved in the cellular processes of viruses. For example, RNA viruses, such as respiratory syncytial virus (RSV), vesicular stomatitis virus (VSV), and coronaviruses, appear to replicate in host cells within viral inclusion bodies, which are membraneless condensates formed by phase separation. Moreover, several studies on coronaviruses have shown that the assembly of viral capsids and genomes occurs in dynamic cytoplasmic foci formed by phase separation, suggesting that phase separation plays a role in the replication and packaging of coronaviruses. Coronaviruses contain a relatively long, 30 kb single-stranded RNA genome, which is compacted into a viral particle in a highly specific manner that excludes host RNA and many subgenomic RNAs. In particular, the nucleocapsid protein (N-protein) of SARS-CoV-2 drives viral RNA genome packaging using LLPS, which is mediated by interactions between specific viral RNA sequences and the multivalent RNA-binding domains and IDRs of the viral proteins. Specific RNA sequences interact with the N-proteins to promote LLPS, and this seems to ensure that the viral RNA is not entangled with other long cellular RNA molecules. LLPS studies on viruses provide novel perspectives on how the composition of RNA determines its packaging into a small viral particle.
Although a protein in the BIPS model is involved in multiple DNA interactions, it does not require multiple protein-protein interactions, which are the main driving forces of SIPS. Thus, BIPS does not require the scaffold protein to have an IDR, which typically provides multivalency and flexibility, because long, flexible DNA can provide multiple binding sites for multivalent DNA-binding proteins. Moreover, while DNA organization is strongly coupled to DNA-protein cluster formation in BIPS, the organization of DNA can be completely independent of phase separation in SIPS (Figure 3D). Although the molecular mechanisms differ, BIPS shares many similarities with SIPS. For example, condensates formed by BIPS can exhibit liquid-like behavior. Hence, the techniques used to study SIPS can be applied to analyze BIPS.
The cohesin-SMC complex is important for interphase chromosome organization, and in vitro experiments have shown that the complex forms condensates via the BIPS mechanism. Cohesin is a good model for a protein with multiple DNA-binding sites. Because it acts primarily as a motor protein that extrudes DNA loops for interphase chromosome organization, the cohesin protein has at least two DNA-binding sites on its surface, which move relative to each other in an ATP hydrolysis-dependent manner. Multiple DNA-binding sites on the cohesin protein have been confirmed by various structural studies, suggesting that it can bridge distant DNA segments.
Clusters formed by the cohesin-SMC complex show a non-monotonic size dependence on DNA length, and the cohesin-dependent BIPS mechanism can successfully explain this behavior by considering DNA bridging activity (Figure 3E,F). In an experiment, the DNA length was varied from 100 bp to 50 kbp, while the DNA concentration was fixed. The DNA-cohesin mixture was incubated and imaged using an atomic force microscope (AFM). For short DNA lengths ($l$ < 3 kbp), no clear cohesin-DNA cluster was formed; however, beyond a crossover point of $l_c$ ≈ 3 kbp, the cluster size increased rapidly with DNA length, scaling as a power law (Figure 3F). The crossover point can be explained quantitatively by considering the free energy cost of DNA looping when a cohesin protein bridges the DNA. When a single cohesin complex bridges two DNA sites to form a loop, the free energy change can be roughly estimated from two contributions: (1) the DNA bending energy; and (2) the entropic cost due to DNA looping. The optimal length for DNA looping can be obtained by minimizing the following free energy:

$$\frac{F}{k_B T} = \frac{2\varepsilon l_p}{l} + 1.5 \log\left(\frac{l}{l_p}\right) \qquad (8)$$

where $l$ is the loop size when DNA is bridged by a single cohesin protein complex, $l_p$ = 50 nm is the persistence length of DNA, and $\varepsilon$ = 16 is the shape parameter based on a teardrop. The free energy reaches its minimum at a DNA length of approximately 3 kbp, and hence DNA must be at least about 3 kbp long to be bridged. A longer DNA construct (>3 kbp) provides a nucleation point that catalyzes further growth of the condensates. The power-law scaling of cluster size with DNA length was reproduced by computer simulations, which modeled cohesin as a patchy particle with two distinct DNA-binding sites.
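The crossover length can be checked numerically from the free energy in Equation (8). The sketch below is illustrative rather than the authors' own calculation: the function names are hypothetical, and the conversion factor of 0.34 nm per base pair (the standard rise of B-form DNA) is an assumption that is not stated in the text. Setting dF/dl = 0 gives l* = 4εl_p/3, which the code compares against a brute-force scan over loop lengths.

```python
import math

L_P_NM = 50.0     # DNA persistence length in nm (value given in the text)
EPSILON = 16.0    # teardrop shape parameter (value given in the text)
NM_PER_BP = 0.34  # ASSUMPTION: rise per base pair of B-form DNA, not stated in the text

def loop_free_energy(l_nm, l_p=L_P_NM, eps=EPSILON):
    """F / (k_B T) of a bridged loop of contour length l_nm: bending term plus entropic term."""
    bending = 2.0 * eps * l_p / l_nm
    entropic = 1.5 * math.log(l_nm / l_p)
    return bending + entropic

def optimal_loop_length_nm(l_p=L_P_NM, eps=EPSILON):
    """Analytical minimum obtained from dF/dl = -2*eps*l_p/l**2 + 1.5/l = 0."""
    return 4.0 * eps * l_p / 3.0

def scan_minimum_nm(start=100.0, stop=20000.0, step=1.0):
    """Brute-force check: loop length on a 1 nm grid with the lowest free energy."""
    n_points = int((stop - start) / step) + 1
    lengths = [start + i * step for i in range(n_points)]
    return min(lengths, key=loop_free_energy)

if __name__ == "__main__":
    l_star = optimal_loop_length_nm()
    print(f"analytical minimum: {l_star:.0f} nm = {l_star / NM_PER_BP / 1000:.1f} kbp")
    print(f"grid-scan minimum:  {scan_minimum_nm():.0f} nm")
```

With the stated parameters, the minimum falls near 1.07 µm of contour length, which corresponds to roughly 3.1 kbp under the assumed rise per base pair, consistent with the observed crossover at about 3 kbp.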
Although BIPS and SIPS seem to be opposing concepts, they can work together to induce efficient phase separation. As discussed, in the BIPS model, a bridged loop can act as a nucleation point (Figure 3B). The loop can attract multivalent proteins involved in SIPS (Figure 3C), resulting in interplay between BIPS and SIPS. Some topologically associating domains (TADs) observed via Hi-C analysis might be formed by BIPS, since an extruded DNA loop at convergent CCCTC-binding factor (CTCF)-binding sites can act as a nucleation point for the growth of assemblies of multivalent DNA-binding proteins. If this model is correct, interactions between a DNA loop and other nuclear condensates, such as transcriptional condensates, would be observed.