Phase Separation of Intrinsically Disordered Nucleolar Proteins

Please note this is a comparison between Version 1 by Francisco Guillén-Chable and Version 2 by Lindsay Dong.

Phase separation allows cells to form subcompartmentalized structures, enabling them to perform simultaneous processes with precise organization and low energy requirements. Chemical modifications of proteins, RNA, and lipids alter the molecular environment, facilitating enzymatic reactions at higher concentrations in particular regions of the cell.

nucleolus
intrinsically disordered
phase separation

1. Introduction

Cellular compartmentalization through lipid membranes permits specific cellular processes to take place at distinct regions inside the cell [1]. However, highly complex cellular systems require a rapid interchange of the components of biochemical reactions with low energy consumption. A growing body of evidence shows that membraneless organelles display highly ordered structures and possess functional features similar to those of membrane-bound organelles [2,3,4]. Cellular membraneless organelles are involved in fundamental and dynamic processes, such as ribosome biogenesis, transcription, and RNA processing, and the molecular principles underlying their behavior are receiving increasing attention [5]. Membraneless organelles display liquid-like behavior, a characteristic of liquid–liquid phase separation (LLPS) [4,6,7]. Membraneless organelles are enriched in proteins, RNA molecules, and lipids [5,6]. These molecules, especially proteins, are regulated through several modifications, which in turn modulate their propensity to interact with other components inside the organelle. RNA constituents are important regulators of the formation and disassembly of membraneless bodies [7,8]. How membraneless organelles (also called biocondensates) are maintained through the cell cycle remains elusive, and many questions arise about the nature of LLPS in a nonequilibrium system such as a cell. The nucleolus provides a well-documented example of an organelle generated by phase separation [9]. Described as a factory of ribosomal RNA and ribonucleoparticles (RNPs), the nucleolus has to deal with a crowded microenvironment to accomplish these vital processes [10,11]. However, many other activities have been described for the nucleolus: stress response [12], cell cycle [13], genome integrity and stability [14,15,16], and RNP biogenesis [17].
All these events converge in an organelle that is not bounded by a membrane, suggesting strict control of the molecular events driving each process. Moreover, how well-organized functional structures are formed to carry out all nucleolar functions with a minimum energy requirement is still elusive.

2. Basic Thermodynamics of LLPS Driving Nucleolus Assembly

The process of LLPS is an entropy-driven event and is, therefore, described by the thermodynamics of free energy [20,21]. Entropy-driven events are defined as spontaneous processes that take place without consuming energy. As stated by the second law of thermodynamics, spontaneous reactions inside the cell proceed by lowering the free energy at constant pressure and temperature. The formation of liquid-like droplets is a passive process that involves weak intra- and intermolecular interactions between RNA, proteins, and lipids within a dynamic scaffold [2,3,22,23,24]. LLPS-assembled biocondensates typically display an almost spherical shape, a propensity for fusion and/or fission, and a high degree of internal molecular dynamics [25,26]. Moreover, LLPS is required for processes related to rRNA transcription [11], mRNA metabolism [27], posttranslational modifications (PTMs) of proteins, and signaling. For example, nucleolar structure formation by coalescence involves two major driving forces: a passive, thermodynamically dependent process and an active, energy-consuming process [28,29]. The two specific features of the model of droplet formation by coalescence are temperature dependence and reversibility. LLPS is a spontaneous mechanism that builds up the nucleolar region and possibly mediates the functional formation of a nucleolus by spinodal decomposition, governed primarily by the Gibbs free energy. In spinodal decomposition, a thermodynamically unstable solution within a miscibility gap transforms into a mixture of two phases. The spinodal region of the phase diagram for each compound is where the free energy can be lowered by allowing the components to separate. This results in an increased relative concentration of a component material (RNA, protein, or lipid) in a particular region of the cell.
The concentration will continue to increase, thus forming a particular structure (nucleolus, Cajal body, etc.). Once nucleation begins, the structure increases in size. To maintain these dynamics, the molecular characteristics of proteins, RNA, and lipids must change through chemical modifications. Phosphorylation and methylation are some of the most common chemical alterations of these three types of molecules.

Very large regions of material will change their concentration slowly because of the amount of material that must move by passive processes, from high to low concentration and along isoelectric point and electrostatic gradients. Very small regions composed of just a few molecules will shrink away because of the energy cost in entropy of maintaining an interface between two dissimilar component materials [30,31,32]. Spinodal decomposition does not require nucleation events because the phases evolve continuously, thus creating a separation of molecules without the aid of chemical energy. However, it is interesting to note that LLPS is not the only mediator of nucleolus formation; energy-consuming processes are also involved. Enzymatic activity is therefore the second force that contributes to the formation of a functional nucleolus [17]. LLPS depends on multivalent crosslinking interactions between molecules, such as RNA and proteins, that possess unstructured metastable regions or intrinsically disordered regions (IDR) [8,33]. These structural features give rise to typical properties of membraneless biocondensates, such as viscoelasticity and temperature-dependent fusion [30,31,32]. The unmixing of binary or ternary mixtures is not the only phase phenomenon in which energy-free growth occurs. Consider an example of two independent protein–lipid complexes that adopt a particular ordered structure in thermal equilibrium. The phase transition starts from a disordered state, in which the two molecular species A and B are randomly distributed in the nucleolus, and leads to an ordered arrangement that depends on their concentration, thus creating a particular phase, as described earlier by theoretical spinodal ordering [34].
Phase separation of proteins that interact with nuclear lipids appears to form a patterned structure in which each molecule occupies a defined position; therefore, diffusion would follow a pattern similar to that of spinodal decomposition.

However, nucleation also plays a role at different concentrations of some of the compounds. In addition to the complexity of the mixture of materials, we have to consider the additional complexity of molecule modification by phosphorylation, acetylation, methylation, etc. [35]. Any of these alterations may change the physicochemical characteristics of a particular molecule and, since it would be at a different concentration, it may change position as it moves from the spinodal to the binodal region or outside the range of phase separation. Thus, a molecule can transit from one structure to another without the addition of chemical energy. Figure 1 shows the miscibility gap where the initial free energy G would allow a particular molecule at concentration C₁ to phase separate from solution. The unstable zone in which spinodal decomposition occurs is bounded by the points where the curvature of the Gibbs free energy curve changes sign, that is, where ∂²G/∂φ² = 0. Molecules can therefore phase separate freely in this region and form particular structures without additional energy. In the cell nucleus, there is an added complexity of molecules, such as RNA, that may drive nucleation events and thereby influence the range of concentrations for phase separation. Moreover, some of the molecules may have “fixed” positions due to interactions with specific DNA sequences. Nucleation requires additional energy, as indicated by ∂²∆G/∂φ² > 0. The factors that set the particular locations for nucleation may range from lncRNA to DNA-binding proteins. LncRNA may provide an additional dynamic situation. Previous studies have shown that RNA can alter the concentration required for phase separation of particular proteins [36,37]. Furthermore, the addition of inhibitors of RNA synthesis, such as actinomycin D or DRB, results in nucleolar alteration and different phase separations, forming the well-known nucleolar caps [38,39].
Therefore, the range of concentrations in which a molecule can freely form a particular phase may be wider in the presence of RNA than in an in vitro scenario. These dynamics may prove relevant for nuclear organization, in particular during the cell cycle and during temperature stress, when particular proteins that participate in nucleolar processes, such as fibrillarin, may redistribute to the cytoplasm [40].

Figure 1. Schematic Gibbs free energy graph and the corresponding phase diagram. (A) The Gibbs free energy graph shows the binodal and spinodal points, where ∂²∆G/∂φ² is greater than zero or equal to zero, respectively; this illustrates the metastable regions before the spinodal points, where instability takes place, followed by stable gel or precipitate formation. (B) The coexistence line (red) separates the one-phase and two-phase regimes and is a function of environmental conditions, such as temperature and pH. The system does not undergo phase separation beyond the critical point, marked as a blue dot. At low concentrations, the system is a one-phase solution. Passing into the yellow zone, we reach the binodal section, where nucleation can take place. At still higher concentrations, between the spinodal points where ∂²∆G/∂φ² = 0, lies the region of instability in which the system must undergo demixing via spinodal decomposition.
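
The regimes sketched in Figure 1 can be illustrated numerically. The following is a minimal sketch that assumes a symmetric Flory–Huggins-type mixing free energy, G(φ) = φ ln φ + (1 − φ) ln(1 − φ) + χφ(1 − φ), in units of kT per site. This functional form, the interaction parameter χ, and all function names are assumptions made for illustration and are not taken from the text above. Under this assumption, the curvature ∂²G/∂φ² vanishes at the spinodal points, is negative between them, and is positive in the metastable zone between each binodal and spinodal point.

```python
import math

def curvature(phi, chi):
    """Second derivative of the assumed mixing free energy with respect to phi."""
    return 1.0 / phi + 1.0 / (1.0 - phi) - 2.0 * chi

def spinodal_points(chi):
    """Return the two compositions where the curvature is zero, or None if chi <= 2."""
    if chi <= 2.0:
        return None  # curvature never becomes negative: no unstable region
    root = math.sqrt(1.0 - 2.0 / chi)
    return (1.0 - root) / 2.0, (1.0 + root) / 2.0

def dilute_binodal(chi, tol=1e-12):
    """Bisection for the dilute coexistence composition (requires chi > 2).

    For this symmetric model the common tangent is horizontal, so the
    binodal is where the first derivative of G is zero.
    """
    low_spinodal, _ = spinodal_points(chi)
    slope = lambda phi: math.log(phi / (1.0 - phi)) + chi * (1.0 - 2.0 * phi)
    lo, hi = 1e-12, low_spinodal  # slope is negative at lo for moderate chi
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def regime(phi, chi):
    """Classify a composition as one phase, metastable, or unstable."""
    if spinodal_points(chi) is None:
        return "one phase"
    b_low = dilute_binodal(chi)
    if phi < b_low or phi > 1.0 - b_low:
        return "one phase"
    if curvature(phi, chi) < 0.0:
        return "unstable (spinodal decomposition)"
    return "metastable (nucleation)"

if __name__ == "__main__":
    chi = 3.0  # hypothetical interaction strength
    print("spinodal points:", spinodal_points(chi))
    print("dilute binodal:", dilute_binodal(chi))
    for phi in (0.01, 0.15, 0.5):
        print(phi, "->", regime(phi, chi))
```

In this toy model, lowering χ below 2 removes the unstable region altogether, which loosely mirrors the critical point in panel (B); real nucleolar mixtures are multicomponent and are not expected to follow this simple form.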

3. The Membraneless Nucleolar Compartment

The nucleolus is a prototypical membraneless compartment inside eukaryotic cells and the location of several important processes during the cell cycle. In conjunction with other subnuclear structures, such as nuclear speckles, paraspeckles, PML bodies, lipid islets, and Cajal bodies, the nucleolus drives crucial, specific, and tightly regulated processes of RNA metabolism and processing. As the core biocondensate, the nucleolus drives the synthesis and processing of rRNA and its assembly with ribosomal proteins (r proteins) to produce functional ribosomes [10,43,44,45]. The nucleolus has a multilayer organization composed of three subcompartments: the fibrillar center (FC), the dense fibrillar component (DFC), and the granular component (GC) (see Figure 2).

4. Intrinsically Disordered Proteins Promote LLPS in the Nucleolus

4.1. Intrinsically Disordered Proteins and Intrinsically Disordered Regions

Nucleolar proteins often contain enzymatically active regions flanked by IDRs (see Figure 2). These intrinsically disordered regions are coupled to modular regions with enzymatic activity, RNA-binding regions, or nucleolar localization signals. Proteins are currently categorized into two main groups: (a) well-structured proteins and (b) proteins without a conserved 3D structure, subdivided into fully disordered proteins and proteins with IDRs. Nucleolus-residing proteins are well characterized by their contribution to sorting, modifying, and assembling rRNA and to ribosome production. Nucleolar proteins often contain stretches of IDRs in their sequences, which highlights the importance of IDRs in driving the synthesis and sorting of rRNA transcripts.

Figure 2. Intrinsically disordered regions in some key protein components of rRNA metabolism drive the LLPS behavior of the nucleolus. (A) The intrinsic disorder tendency of the RNA Pol I machinery and of the well-characterized snoRNP complexes that drive pseudo-uridylation and methylation, the H/ACA and C/D box snoRNPs, respectively. As calculated by the IUPred algorithm (www.iupred2a.elte.hu, accessed on 11 March 2021), a tendency to disorder is predicted when the per-residue IUPred score exceeds 0.5. Intrinsically disordered regions vary between complexes and between N- and C-terminal domains; for example, the major RPA1 factor of RNA Pol I exhibits a tendency to disorder in its C-terminal domain, as does the GAR1 protein found in the H/ACA complexes. In contrast, fibrillarin from the C/D box complex displays a well-documented disordered region in its N-terminal domain. (B) Phase diagrams computed with the CIDER (Classification of Intrinsically Disordered Ensemble Regions; www.pappulab.wustl.edu/CIDER, accessed on 11 March 2021) algorithm showing the conformations that an IDP could be expected to adopt: globules and tadpoles, collapsed or expanded conformations, coils and hairpins, and swollen coils. Although the disorder tendency identifies a set of proteins containing IDRs, each corresponding phase diagram indicates only a small tendency to phase separate. As shown in the phase diagrams, the majority of the proteins involved in the complexes fall in region two of the plot, corresponding to collapsed or extended conformations. This is consistent with the idea that LLPS is coupled to other energy-consuming processes that ultimately promote a multicomponent phase-separated nucleolus. Protein names were omitted from the plots to prevent overlapping.
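
The 0.5 cutoff mentioned in the caption can be applied to a per-residue score profile in a few lines. The sketch below is illustrative only: it does not call IUPred itself, the function name and the `min_length` parameter are hypothetical, and the score list is made-up toy data rather than output for any real protein.

```python
def disordered_regions(scores, threshold=0.5, min_length=1):
    """Return 1-based inclusive (start, end) ranges where scores exceed the threshold.

    scores: per-residue disorder scores, for example exported from a predictor.
    min_length: hypothetical filter that drops runs shorter than this many residues.
    """
    regions = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position  # a new disordered run begins here
        elif start is not None:
            if position - start >= min_length:
                regions.append((start, position - 1))
            start = None
    # close a run that extends to the last residue
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions

if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.3]  # made-up values
    print(disordered_regions(toy_scores))
    print(disordered_regions(toy_scores, min_length=2))
```

A minimum run length is a common practical choice for ignoring isolated residues above the cutoff, but the value to use is a judgment call and is not specified in the caption.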

4.2. Physicochemical Features and Sequence Signatures in IDR-Containing Proteins

The tunable physicochemical properties of key nucleolar proteins involved in several aspects of the cell cycle and cell development have raised the question of how PTMs regulate biocondensates in an active, energy-consuming process. Several proteins related to rRNA transcription, processing, and nucleolus assembly possess low-complexity domains, often called intrinsically disordered domains, characterized by aromatic and charged amino acids. A well-documented signature is the RGG/RG repeat, a region that is modified by methyltransferases, which add a methyl group to the terminal guanidino nitrogen atom of arginine. Fibrillarin and nucleolin are highly expressed in the nucleolus and contain stretches of RGG/RG repeats in their sequences [51,52,53].
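
As a rough illustration of these sequence signatures, the sketch below counts RGG and RG motifs and computes the fractions of charged and aromatic residues in a one-letter amino acid sequence. The function name, the choice of D, E, K, and R as charged residues (histidine excluded), and F, W, and Y as aromatic residues are assumptions for this example, and the input sequence is a made-up toy string, not a real fibrillarin or nucleolin sequence.

```python
import re

CHARGED = "DEKR"   # assumed charged set; histidine is deliberately excluded
AROMATIC = "FWY"   # assumed aromatic set

def rgg_rg_summary(sequence):
    """Count RGG/RG motifs and charged/aromatic fractions in a protein sequence."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")
    # Lookahead patterns count every motif start position, including overlaps.
    rgg_count = len(re.findall(r"(?=RGG)", seq))
    rg_count = len(re.findall(r"(?=RG)", seq))  # includes the RG inside each RGG
    charged = sum(seq.count(residue) for residue in CHARGED)
    aromatic = sum(seq.count(residue) for residue in AROMATIC)
    return {
        "length": length,
        "rgg_count": rgg_count,
        "rg_count": rg_count,
        "charged_fraction": charged / length,
        "aromatic_fraction": aromatic / length,
    }

if __name__ == "__main__":
    toy_sequence = "GRGGFRGGRGDY"  # hypothetical sequence for illustration
    for key, value in rgg_rg_summary(toy_sequence).items():
        print(key, value)
```

Counting motifs does not indicate whether the arginines are actually methylated; that depends on methyltransferase activity and cannot be read from the sequence alone.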

4.3. Posttranslational Modifications Regulating IDR-Containing Proteins

However, how these RGG/RG signatures impact the fluidity and maintenance of subnucleolar organization is still obscure. Arginine methylation inhibits LLPS in vitro, whereas in vivo it appears to promote LLPS through two mechanisms: (1) modifying protein–RNA interactions or (2) promoting partner interactions between proteins. Phosphorylation of serine or tyrosine residues has been getting more attention for its relevance in LLPS [24,54]. Contrary to arginine methylation, phosphorylation and dephosphorylation of threonine and tyrosine residues can occur quickly and change the properties of biocondensates in a tunable manner. Further information about PTMs as key regulators of LLPS can be found in the excellent review in [54].

4.4. Key Nucleolar Proteins Containing IDR

There is evidence that the dysregulation of key nucleolar proteins could impact the LLPS-based maintenance of nucleolar structure; these proteins include fibrillarin, nucleolin, GAR1, and nucleophosmin, among others. Nucleolin is a highly conserved protein found in the nucleolar GC and is composed of three domains: an N-terminal domain made up of an HMG-like domain, a central domain with two different stretches of RNA-binding domains, and a C-terminal domain rich in arginine and glycine repeats similar to that found in fibrillarin [55,56,57]. Two main proteins residing at the DFC, fibrillarin and GAR1, are the catalytic centers of two different complexes that guide the site-specific methylation of ribosomal RNA (rRNA) nucleotide residues and the pseudo-uridylation of uridine residues in the rRNA, respectively [58]. These two proteins contain IDRs at the N- and C-terminal regions, respectively [46,53]. The N-terminal IDR of fibrillarin is a glycine–arginine rich region (GAR-domain) containing several repetitions of [RG] boxes throughout the domain [59]. The fibrillarin GAR-domain functions as a signal that directs the protein to the nucleolus to produce ribosomes, and this region is also involved in the recognition of lipids, some specific RNA partners, and other proteins. Recently, the GAR-domain of Arabidopsis thaliana fibrillarin was described as a modular domain with ribonuclease activity, adding another layer of complexity to this domain and to the whole protein itself [60]. The GAR1 protein is the central catalytic unit of the H/ACA ribonucleoparticle complexes, mainly involved in the modification of specific residues on the rRNA, but it is not clear whether the C-terminal IDR plays an important role in the interaction with other partners and in localization to the nucleolus [58].
Nucleophosmin (NPM1, also known as B23) is a highly conserved nucleolar phosphoprotein related to the last stages of ribosome biogenesis and located at the GC [61]. Like other nucleolar proteins, NPM1 is a modular protein composed of several domains: an oligomerization domain (OD) and two IDRs located between the OD and the C-terminal domain, which possess DNA- and RNA-binding activity and bind G-quadruplex DNA sequences [61,62]. NPM1 functions are often related to the last steps of ribosome processing and assembly with r proteins, but several other functions of NPM1 have also been described: regulation of centrosome duplication, genome stability, stress response, apoptosis, cancer, and rRNA gene remodeling. Although NPM1 is important for rRNA assembly with r proteins, its function is indispensable for ribosome biogenesis. NPM1 depletion disrupts the nucleolar structure. Nucleolar localization of NPM1 depends on the formation of a pentameric oligomer, and disruption of this oligomer by phosphorylation of a single amino acid, serine 48 in the OD, changes its localization to the nucleoplasm. Oligomerization relies on the OD and on the adjacent acidic IDR [63]. Further, lipids are important in the LLPS system [8,40,64]. It remains unclear how lipids could be involved in the formation, nucleation, or maintenance of biocondensate bodies. Recently, the nuclear nanoscale localization of PI(4,5)P2, PI(3,4)P2, and PI(4)P was determined by dSTORM. A subset of PI(4,5)P2 and PI(3,4)P2 was detected near the RNA Pol II machinery, specifically in nuclear speckles, a biocondensate known to concentrate mRNA and proteins related to RNA-modifying routes. Moreover, another population of PI(4,5)P2 was detected in the nucleolar DFC, surrounded by fibrillarin, displaying a ring-shaped structure [65]. Lipid islets have been related to the creation of nucleation points that concentrate key signaling proteins in response to several types of stress [64].
RNPs are well-conserved complexes in eukaryotic cells, mainly concentrated where modifications and the last processing events of ribosome biogenesis take place. Classic examples are the C/D and H/ACA box RNPs, involved in the methylation and pseudo-uridylation of specific residues on recently transcribed rRNAs (see Figure 2) [66,67,68,69]. The link between RNPs and the intrinsic disorder content of their protein components has been demonstrated (Figure 2), for example, for the GAR1 and fibrillarin proteins of the H/ACA and C/D box RNPs [52,70]. Moreover, it has been demonstrated that the presence of IDRs in the RNP complex drives LLPS events inside the cell [71,72]. Studying RNP biogenesis as well as its functional activity inside cells could shed some light on the principles governing LLPS, because RNPs can form specific, focal nucleation points that resemble the major LLPS biocondensates found in cells. A clear example is the RNP complex in which the expression status of uL3 could also influence rRNA biogenesis in certain types of cancer [73].