2. Molten Globule as an (Un)Folding Intermediate
Historically, protein denaturation and unfolding studies are based on the well-accepted and rather obvious (at least now) mantra stating, “Structure does exist since it can be broken”. These studies played crucial roles in establishing protein science in general and in understanding the basis of the correlation between protein amino acid sequence and function in particular. As early as 1931, Hsien Wu (1893–1959) proposed the first theory of protein denaturation: the active structure is known to exist because it is destroyed by denaturation
[40][41]. His paper published in the Chinese Journal of Physiology contained the first statement that protein function depends on prior structure
[40][41]. However, even earlier, in 1925, Mortimer Louis Anson (1901–1968) and Alfred Ezra Mirsky (1900–1974) showed that intact hemoglobin can exist as such near the neutral point only, whereas dilute acid or alkali changed it to the denatured form, which could fold back to its native state upon restoration of native conditions, indicating that protein denaturation and unfolding are reversible processes
[42].
In 1936, the first Western review on protein denaturation that represents the first modern theory of native and denatured proteins was published, where Alfred Mirsky and Linus Pauling (1901–1994) stated that the loss of certain highly specific properties constitutes the most significant change that occurs in the denaturation of a native protein
[43]. By 1944, it became clear that native proteins have unique structures, that the denaturation processes are manifold in nature and magnitude, and that the addition of high concentrations of strong denaturants, such as guanidine hydrochloride (GdmHCl) or urea, to a protein causes a complete (or almost complete) disruption of all conformational interactions and, as a consequence, the transformation of a protein molecule into the highly disordered state of a random coil
[44]. Furthermore, the authors of this seminal review stated: “The term denaturation has been used rather loosely and indiscriminately to denote ill-defined changes in the properties of proteins, caused by a variety of chemical, physical, and biological agents. The observation that many unrelated processes may cause similar changes in a protein early led to the belief that any single change, such as the formation of a coagulum, suffices to characterize a ‘denatured’ protein, and that all denaturing agents are alike in their action. Although proteins are now known to respond differently to various kinds of denaturation, the supposition of the singleness of the denaturation process has persisted”
[44].
They also defined denaturation as “any non-proteolytic modification of the unique structure of a native protein, giving rise to definite changes in chemical, physical, or biological properties”
[44]. It is obvious that a clear distinction should be made between the terms “denaturation” and “unfolding”. Here, as defined above, denaturation is a process leading to the elimination of protein functionality resulting from the disruption of functional 3D structure. This can be triggered by a wide range of conditions, with the resulting denatured forms possessing a wide spectrum of properties depending on the conditions under which they were obtained. In contrast, protein unfolding is defined as a process leading to the complete elimination of all the conformational forces stabilizing the native protein structure, resulting, therefore, in the formation of a coil-like conformation.
Retrospectively, finding partially folded species of globular proteins under a variety of denaturing conditions should not be surprising. This is because the unique 3D structure of a protein molecule is stabilized by specific non-covalent interactions, such as hydrogen bonds, hydrophobic interactions, electrostatic interactions and salt bridges, and van der Waals interactions. Since these conformational forces have different physical natures, it is quite possible that they would react differently to changes in the environment, where under specific conditions, some forces would decline and dissipate, whereas others would stay unchanged or even strengthen. In these cases, the protein molecule is obviously losing its biological activity; i.e., it is becoming denatured, but since not all the conformational forces are “shut down”, denaturation is not necessarily accompanied by the complete unfolding of a protein, giving rise to the appearance of new conformations with properties halfway between those of native and completely unfolded states. Therefore, various degrees of denaturation/unfolding must exist, depending on the extent to which the structure of the protein has been modified under given conditions. Clearly, the fact that the extent of denaturation can be different is incompatible with the “all-or-none” hypothesis that a given protein can exist in only one of two states, the completely native or the completely denatured/unfolded
[44].
These important considerations were rooted in the experimental evidence accumulated in the 1930s and 1940s, when the incomplete unfolding and existence of some intermediate stages of denaturation were recognized in several instances
[45][46][47][48][49]. Furthermore, as follows from later studies, some denatured forms produced at milder denaturing conditions (e.g., heat- or pH-denatured proteins) can undergo additional structural alterations in the presence of strong denaturants, such as urea or GdmHCl
[50]. Therefore, since the final denatured conformations of proteins are strongly dependent on the denaturing agent, not all denatured states are structurally similar, and under certain conditions the protein molecules are not completely unfolded.
These very logical conclusions were formulated in a classical review by Charles Tanford (1921–2009)
[9], which was one of the first papers providing in-depth analysis of the possibility that during the unfolding of globular proteins, accumulation of some equilibrium intermediate states might be expected. Unfortunately, since the results available at that time were too scanty, no serious generalization could be made. Furthermore, in the vast majority of studies reported at that time, accumulation of an intermediate during protein unfolding was regarded as an exception to the rule, whereas a conformational transition described by a two-state model represented the “normal” response of a protein to changes in its environment. Although proteins were shown to respond differently to various kinds of denaturation, the supposition of the singleness of the denaturation process persisted
[44].
An intermediate state accumulating during the unfolding process was first identified in 1973 by Tanford’s group while looking at the chemical unfolding of bovine carbonic anhydrase B (BCAB) by GdmHCl
[51]. It is notable that the intermediate state identified by far- and near-UV circular dichroism (CD) spectroscopy was described as having the secondary structure of the native state but as having lost the tertiary structure
[51]. A year later, Kin-Ping Wong and Larry M. Hamlin used circular dichroism, difference spectrophotometry, enzymatic activity, and viscosity to study acid denaturation of this protein and showed that acid-denatured BCAB was enzymatically inactive and did not have a unique 3D structure as judged by near-UV CD; it also did not exist in the random-coil state as indicated by viscosity and far-UV CD
[52]. Around the same time, Ptitsyn’s group
[27] initiated their work, which eventually led to further early insights into the folding intermediate. It was suggested that the formation of a native-like secondary structure preceded the proteins acquiring their tertiary structures. The results of the analysis of acid- and temperature-induced denaturation from this group were found to support this notion
[10][11][13]. It was Ohgushi and Wada who in 1983 coined the term “molten globule” to describe such folding intermediates
[12].
The most defining characteristics of a “classic” MG are outlined below
[14][15][16][25][53][54][55][56][57][58][59][60][61][62][63]. A protein molecule in the MG state is characterized by the presence of a significant secondary structure (which is often classified as native-like secondary structure) with no or little tertiary structure (tight packing of side chains of amino acid residues is absent). Furthermore, 2D-NMR coupled with a hydrogen-deuterium exchange showed that the protein molecule in the MG state is characterized not only by the native-like secondary structure content, but also by the native-like folding pattern
[64][65][66][67][68][69][70][71][72][73]. Small-angle X-ray scattering (SAXS) analysis revealed that the molten globular proteins possess globular structure typical of native globular proteins
[74][75][76][77][78][79]. In agreement with the preservation of globular structure, the protein molecule in this state is characterized by a high degree of compactness: relative to the native state, its radius of gyration or hydrodynamic radius typically increases by only 10–20%, which corresponds to a volume increase of ~50%
[55][80].
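The relation between the reported radius and volume increases can be checked with a short calculation. The sketch below assumes that the molecule stays roughly spherical, so that its volume scales as the cube of its radius; the function name is illustrative and not taken from the cited works.

```python
def volume_increase(radius_increase_fraction: float) -> float:
    """Fractional volume increase of a sphere-like globule whose radius
    grows by the given fraction (assumption: volume scales as radius cubed)."""
    return (1.0 + radius_increase_fraction) ** 3 - 1.0


# Radius expansions spanning the 10-20% range quoted in the text.
for r in (0.10, 0.15, 0.20):
    print(f"radius +{r:.0%} -> volume +{volume_increase(r):.0%}")
```

Under this assumption, the ~50% volume increase quoted in the text corresponds to the middle of the 10–20% radius range (a 15% larger radius gives about 52% more volume), while the two ends of the range give roughly 33% and 73%.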
A considerable increase in the accessibility of a protein molecule to proteases was noted as a specific property of the MG state
[81][82][83][84][85][86][87]. There was also an increase in the solvent exposure of the hydrophobic core, which was now less compact than the core of a native globular protein. This was reflected in the characteristic capability of the MG to specifically bind a hydrophobic fluorescent probe 1-anilino-naphthalene-8-sulfonate (ANS) or 1,1′-Bis(4-anilino-5-naphthalenesulfonic acid) (bis-ANS)
[88][89][90]. MGs can show substantial levels of structure in some cases
[53]. Lynne Regan reported that one part of a protein can retain the native structure, whereas another part forms an MG
[91]. That is expected as proteins in general are characterized by noticeable structural heterogeneity, and conformational stability/flexibility can vary across the protein regions
[92][93][94]. The abundant existence of intrinsically disordered proteins (IDPs) with various levels of disorder, and the presence of intrinsically disordered protein regions (IDPRs) in numerous proteins serve as extreme examples of this phenomenon
[81][92][93][94].
While earlier data on the denaturation/unfolding and refolding of small proteins were compatible with the two-state model comprising N → D and D → N transitions, the fact that many proteins were shown to form MGs during their unfolding indicated that the reality was more complex, and one should consider protein unfolding as the sequential process N ↔ MG ↔ U. This clearly raised a question about the physical and thermodynamic nature of the corresponding N ↔ MG and MG ↔ U transitions. The answer to this question was retrieved first from the results of the multiparametric experimental analysis of equilibrium GdmHCl-induced unfolding of BCAB and S. aureus β-lactamase at 4 °C, which clearly showed that the molten globule was separated from the more unfolded states by the “all-or-none” transition (this was evidenced by the bimodal distribution function of the molecular dimensions within the transition from the molten globule to the unfolded state)
[95].
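The sequential scheme N ↔ MG ↔ U can be illustrated with a minimal equilibrium sketch. It assumes that each step is an all-or-none transition whose free energy depends linearly on denaturant concentration (the linear extrapolation model, which is an assumption made here and is not stated in the text). All parameter values are hypothetical and are not fitted to BCAB or β-lactamase data.

```python
import math

R_KJ = 8.314e-3  # gas constant, kJ/(mol*K)


def populations(denaturant_M, dg_n_mg, m_n_mg, dg_mg_u, m_mg_u, temp_K=277.15):
    """Equilibrium fractions (N, MG, U) for the sequential scheme N <-> MG <-> U.

    Assumption: each step follows dG = dG0 - m * [denaturant]
    (dG0 in kJ/mol, m in kJ/(mol*M)). Default temperature is 4 degrees C.
    """
    rt = R_KJ * temp_K
    k_n_mg = math.exp(-(dg_n_mg - m_n_mg * denaturant_M) / rt)  # [MG]/[N]
    k_mg_u = math.exp(-(dg_mg_u - m_mg_u * denaturant_M) / rt)  # [U]/[MG]
    z = 1.0 + k_n_mg + k_n_mg * k_mg_u  # partition sum relative to N
    return 1.0 / z, k_n_mg / z, k_n_mg * k_mg_u / z


# Hypothetical parameters: the N <-> MG midpoint lies below the MG <-> U midpoint,
# so the MG accumulates at intermediate denaturant concentrations.
for conc in (0.0, 1.0, 2.0, 3.0, 4.0):
    n, mg, u = populations(conc, dg_n_mg=20.0, m_n_mg=15.0, dg_mg_u=25.0, m_mg_u=10.0)
    print(f"{conc:.1f} M: N={n:.3f} MG={mg:.3f} U={u:.3f}")
```

With these illustrative values, the intermediate dominates only when the first transition is complete before the second begins; if the two midpoints coincide, the MG population stays low and the unfolding curve looks like a two-state transition.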
Later, a similar bimodal distribution in the HPLC gel-filtration profiles was observed within the unfolding pathways of the NAD+-dependent DNA ligase from the thermophile Thermus scotoductus [22][23]. In line with these observations, an analysis of the then-available data on the equilibrium urea- and GdmHCl-induced N → U, N → MG, and MG → U transitions of globular proteins revealed that the cooperativity of all these unfolding processes increased linearly with the increase of the molecular weight of the protein up to 25–30 kDa. This indicated that the solvent-induced transitions from the native to the unfolded state, from the native to the molten globule state, and from the molten globule to the unfolded state were characterized by an “all-or-none” nature, thereby suggesting that the molten globule represented a third thermodynamic state of a protein molecule
[96][97]. The validity of this model was later supported by Vijay S. Pande and Daniel S. Rokhsar, who in 1998 analyzed the equilibrium properties of proteins with Monte Carlo simulations and showed that, in addition to a rigid native state and a nontrivial unfolded state, a generic phase diagram contained a thermodynamically distinct MG state, further supporting the idea that MG represented a third phase state of proteins
[98].
3. Potential Functionality of Folding Intermediates
Even before the acknowledgement of the prevalence and biological importance of intrinsically disordered proteins with their considerable structural heterogeneity, it was recognized that folding intermediates, including MGs, might have biological relevance. One of the first notes about this scenario was a hypothesis that the MG state may be involved in the translocation of proteins across membranes
[99]. This idea was successfully supported by experiments, and there is now enough evidence that translocation of proteins and their insertion into membranes involve the MG state
[100][101][102][103][104]. Model systems with α-lactalbumin showed the binding of MG to lipid bilayers
[105]. In general, globular proteins can be transformed into the MG states on interaction with the membrane surface
[106]. Such N → MG transitions in the vicinity of a membrane can be induced by the action of the so-called “membrane field”, which is a combination of the local decrease in the effective dielectric constant of water near the organic surface with the effect of negative charges located on the membrane surface
[107][108][109][110]. Release and loading of the large, tightly packed hydrophobic ligands from and to the globular proteins might be facilitated by the partial unfolding of the carrier (N → MG transition) resulting from the concerted action of the moderate local decrease of pH and of the dielectric constant in proximity to the target membranes
[111].
Furthermore, many proteins responsible for the transport of large hydrophobic ligands might have MG properties in their preloaded apo-forms
[112][113][114]. It was also shown that many carbohydrate- and amino acid-binding periplasmic proteins in E. coli form molten globules, which bind their respective ligands
[115]. Chaperonins interact with MGs and prevent their aggregation
[116]. Earlier, Martin et al. discussed how a chaperonin-mediated folding had an MG as an intermediate
[117]. It was also pointed out that compact, MG-like intermediates are localized within a central cavity of the chaperonin GroEL
[118][119][120]. Facilitated folding of actins and tubulins occurs via a nucleotide-dependent interaction between the cytoplasmic chaperonin and the distinctive folding intermediates
[121]. The presence of MG during nascent peptide folding has been inferred
[122].
Importantly, although aforementioned functionalities have been attributed to the MG-like conformations, the major emphasis of all these and similar studies was still focused on the assumption that these functional MGs were folding intermediates kinetically trapped by the chaperonins just after the protein biosynthesis but before proteins become completely folded
[25][99][123] or appear as a result of point mutations preventing polypeptides from complete folding
[25][124] or originate from the denaturing effects of the membrane field
[99][100][101][102][103][104][105][106][107][108][109][110] or ligand binding or release
[112][113][114]. However, the presence of MGs in cells has become an established fact
[125]. All these observations provided strong support to the validity and importance of the concept of MG as a folding intermediate of globular proteins in vivo.
4. How Can One Find Molten Globules, and Where Can They Be Found?
MGs of globular proteins are generally obtained by their mild denaturation that can be induced by acid, alkali, low to medium concentrations of chemical denaturants such as urea and GdmHCl, chaotropic salts, moderately high temperature, and, for some proteins, even by low temperature
[126][127][128][129][130][131][132][133][134][135][136][137][138][139][140][141][142][143][144][145]. Later studies revealed that in some proteins, an MG can also be induced by various organic solvents
[146][147][148][149]. However, it was also shown that fluorinated alcohols can preferentially stabilize α-helices, leading to the formation of non-native helical structures in some all-β-sheet proteins. For example, such highly helical states were induced by 2,2,2-trifluoroethanol (TFE) in several all-β-sheet proteins, such as cardiotoxin analogue II (CTX II) from the Taiwan cobra (Naja naja atra) [150], procerain, a cysteine protease from Calotropis procera [151], β-lactoglobulin [152][153][154][155], and melittin [152][155], to name a few. Transitions from all-β-sheet to mostly α-helical structure in β-lactoglobulin and melittin were also induced by hexafluoroisopropanol (HFIP), as well as by the non-fluorinated alcohols isopropanol, ethanol, and methanol
[152][155]. Curiously, it was pointed out that an alcohol-induced α-helical state of β-lactoglobulin structurally resembles a transiently populated folding intermediate with high levels of non-native α-helical structure, which is formed within a few milliseconds during the refolding of this protein
[156], suggesting that an intermediate with the non-native α-helical structure can accumulate during the refolding process of β-lactoglobulin, emphasizing that the hierarchical model cannot correctly describe folding of some β-structural proteins, including β-lactoglobulin
[154][156].
Secondary and tertiary structures are generally evaluated by far- and near-UV CD, respectively. Secondary structure can also be evaluated with Fourier-transform infrared spectroscopy (FTIR) or optical rotatory dispersion (ORD), whereas viscosity measurements, gel-filtration chromatography, dynamic light scattering (DLS), SAXS, and electron microscopy are used to track expansion of the molecular volume
[61][62]. The decrease in the compactness accompanied by the increased solvent accessibility of the hydrophobic core is normally estimated by looking at the binding of the fluorescent dye ANS to a protein molecule
[88][89][90]. However, it was also pointed out that since ANS and bis-ANS have a strong affinity to the partially folded MG state, they can shift the equilibrium from favoring the native state (N) to favoring the MG state
[89]. As a result, an apparent destabilization of the native state is observed, as was shown for the nucleotide-binding molecular chaperone DnaK
[89]. On the other hand, binding of ADP or ATP to the native state of this protein resulted in a shift of the equilibrium from the MG toward the N state
[89]. Furthermore, as early as 1995, Anthony L. Fink (1943–2008) cautioned that “It is important to note that the presence of ANS tends to increase the propensity of molten globules and compact denatured states to aggregate, and that aggregation increases the ANS fluorescence emission”
[62].
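The ligand-induced shifts of the N ↔ MG equilibrium described above follow from standard thermodynamic linkage. The sketch below assumes a single binding site, a known free ligand concentration, and independent dissociation constants for the two states. All names and numerical values are hypothetical and are not taken from the DnaK study.

```python
import math


def apparent_k(k0, free_ligand, kd_mg, kd_n):
    """Apparent [MG]/[N] ratio in the presence of a ligand.

    Assumption: one binding site per molecule in each state, so that
    K_app = K0 * (1 + L/Kd_MG) / (1 + L/Kd_N), with L the free ligand
    concentration in the same units as the dissociation constants.
    """
    return k0 * (1.0 + free_ligand / kd_mg) / (1.0 + free_ligand / kd_n)


def fraction_mg(k):
    """Fraction of molecules in the MG state for a two-state N <-> MG equilibrium."""
    return k / (1.0 + k)


K0 = 0.01  # hypothetical [MG]/[N] ratio without ligand

# A dye that binds only the MG (no measurable affinity for N) pulls the equilibrium toward MG.
k_dye = apparent_k(K0, free_ligand=100.0, kd_mg=10.0, kd_n=math.inf)

# A nucleotide that binds N far more tightly than MG pushes the equilibrium back toward N.
k_nucleotide = apparent_k(K0, free_ligand=100.0, kd_mg=1000.0, kd_n=1.0)

print(fraction_mg(K0), fraction_mg(k_dye), fraction_mg(k_nucleotide))
```

The same expression covers both cases in the text: a probe with higher affinity for the MG raises the apparent MG population, whereas a ligand with higher affinity for the native state lowers it. This is why a probe such as ANS can perturb the equilibrium it is meant to report on.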
Some other techniques like hydrogen-deuterium exchange, NMR, X-ray, isothermal titration calorimetry (ITC), differential scanning calorimetry (DSC), and computational methods have also been increasingly applied in later years
[71]. In general, all the techniques/methods applicable to looking at protein structure and stability can give valuable information about partially folded intermediates like MGs
[157]. For example, various fluorescence techniques, such as analysis of the intrinsic and extrinsic fluorescence (both steady-state and time-resolved), fluorescence anisotropy, Förster resonance energy transfer (FRET), dynamic and static fluorescence quenching, and proteolytic susceptibility are also used quite often
[158].
In addition to the classical examples of α-lactalbumin, BCAB, and β-lactamase, both equilibrium and kinetic (transient) MGs have been described for a number of proteins and their mutants
[159][160][161]. One interesting comparison is between the MGs formed by α-amylases from a thermophile and those formed from a mesophile
[162]. This analysis revealed that the MG of the thermophile was more stable, which is not surprising. The polyols were less effective in refolding of the MG of the mesophilic enzyme
[162].
Another interesting class of proteins comes from halophiles. These generally require >0.5 M KCl to be functional. In several cases, these proteins, just like those from thermophiles, are fairly stable towards unfolding. The mechanism of halo-adaptation was investigated by Gloss et al.
[163] by looking at the kinetics of folding of urea-denatured dihydrofolate reductases (DHFR) from E. coli and a halophile. In both cases, after a burst intermediate, formation of two intermediates was detected. The data were consistent with salt ions destabilizing the unfolded states in both cases. The authors concluded that halo-adaptation involves affecting the solvent via a hydrophobic effect, the Hofmeister effect, preferential hydration, and crowding. This is in line with the X-ray crystallography and structural data that showed extensive solvation but little salt binding in the case of many halophilic proteins
[163].
Yet another example of complexity in halo-adaptation by halophilic proteins is the role of protein hydration
[164]. Given their higher surface charge density, it is widely believed that these are highly hydrated even in their native forms. This excessive hydration was expected to be responsible for the exceptional stability of corresponding proteins under saline conditions. The results obtained with an engineered protein with a high number of acidic residues on its surface suggested that not only was the surface hydration of a halophilic protein not much larger than that of a mesophilic counterpart, but even its hydration dynamics during unfolding was not very different
[164].
Study of the proteasome from the extremely halophilic archaeon Haloarcula marismortui revealed that while other enzymes unfolded under sub-saline conditions, the proteasome was more resistant
[165]. The biological significance of this resistance is that it explains how the proteasome can degrade damaged proteins under sub-saline conditions, which represent a stress situation for the organism
[165].
Uversky compared the stabilities of proteins from mesophiles with those from halophiles, thermophiles, and barophiles while advancing a hypothesis about the role of protein dielectricity in affecting the solvent properties in the context of protein-protein interactions
[166]. The research mentions the earlier work with β-lactoglobulin, in which it was reported that the molten globule formation by the protein in alcohol-cosolvent mixtures was directly dependent on the decrease in the dielectric constant of the water as a result of mixing the simple alcohols
[109]. Interestingly enough, in an independent observation, Gupta et al. around the same time observed that for a number of proteins, the enzyme stability in aqueous-organic cosolvent mixtures was dictated by the polarity index of the organic solvent
[167]. Solvents with a polarity index of 5.8 and above were good cosolvents, which did not destabilize the protein even when added to the aqueous buffer at up to 50% (v/v)
[167]. Both dielectric constants and polarity indexes are measures of solvent polarity.
Another interesting observation has been reported about MG formed by chymotrypsinogen
[168]. A single cysteine reacts with glutathione at a very rapid rate. Such hyper-reactive cysteine residues are also present in serum albumin, lysozyme, and ribonuclease
[168]. However, cysteine residues present in two proteins of a thermophile (in which glutathione is absent) did not display this hyper-reactivity. The authors infer that this unusually high reactivity of cysteine residues is relevant to the oxidative refolding of proteins in organisms that have an oxidized/reduced glutathione system
[168].
5. Molten Globules and Intrinsic Disorder in Proteins
Coming back to the hypothesis on the potential role of protein dielectricity in affecting the solvent properties mentioned earlier
[166], in the context of functional relevance of partially unfolded protein intermediates, it was proposed that a protein lowers the dielectric constant of the local medium around its interface with the aqueous solvent/water rich medium. This facilitates the behavior of proteins acting as “unfoldases”. Many proteins, in order to be functional, have to be unfolded (then referred to as conditionally disordered proteins)
[169][170]. In many cases, this conditional unfolding is initiated by the interacting protein, which acts as an unfoldase by lowering the local dielectric around it; this leads to the binding between the two as a part of a biologically relevant process. Examples include unfolding of BCL-xL while interacting with the intrinsically disordered PUMA, which in turn folds upon binding as entropic compensation
[166]. Unfoldases also include ATP-dependent proteases (such as those in proteasomes) and molecular chaperonins. Early examples in which this unfoldase behavior was observed were pore-forming domains of some toxins and carrier proteins of large nonpolar ligands. Aggregation, including cases in which it leads to amyloid formation (and is responsible for many diseases), may also be initiated by a protein lowering the dielectric constant around it. A few other relevant examples are available
[170][171][172][173]. Therefore, this hypothesis provides a common thread running through diverse phenomena
[166]. Interestingly enough, later work has confirmed that functionally relevant unfolded structures of many bacterial toxins are molten globules
[174][175][176].
One should keep in mind that intrinsic disorder in proteins represents a highly heterogeneous phenomenon, and functional IDPs can be disordered to different degrees. In fact, the existence of native (i.e., functional) coils, PMGs, and MGs was reported
[54][177][178][179][180]. Furthermore, different parts of a protein molecule can be disordered to different degrees, and a functional protein can contain ordered, molten globular, pre-molten globular, and coil-like domains. Moreover, IDPs/IDPRs (and, as a matter of fact, any protein molecule in general) can be structurally represented as spatio-temporal combinations of foldons (independent foldable units of a protein), inducible foldons (disordered regions that can fold at least in part due to the interaction with their binding partners), inducible morphing foldons (disordered regions that can fold differently upon interaction with different binding partners), semi-foldons (always semi-folded regions), non-foldons (non-foldable protein regions), and unfoldons (regions that undergo an order-to-disorder transition to become functional)
[92][93][181][182]. Another important note is that these functional disordered elements (i.e., foldons, inducible foldons, inducible morphing foldons, semi-foldons, and non-foldons) can structurally exist as coils, PMGs, or MGs.
There is another pointer to the complexity of the process. Bychkova et al. have discussed the differences between an MG and an IDP
[125]. In the latter, there is a greater disruption of local structure; H-D exchange is higher. However, researchers do not have any data regarding a comparison between the two different forms of MGs (WMG and DMG) and IDPs.
There is also an interesting observation that MG-like IDPs can drive liquid-liquid phase separation (LLPS) that leads to the formation of protein condensates
[183]. It is reported that the replication and transcription of respiratory syncytial virus take place within “viral factories”, which are liquid-like structures within the cytosol of infected cells, and that the phosphoprotein tetramer (which is involved in these processes and has a highly disordered N-terminal domain and a molten globular C-terminal domain) displays LLPS during a thermal transition, which is accompanied by the folding of the MG domain
[183]. When the phosphoprotein is mixed with a nucleoprotein, which is also a part of the viral replication complex, again a phase separation is observed. Based on their observations, the authors of this study concluded that for LLPS to take place in vitro and in the cell, a weak, MG-like structure must be present, and such a structure defines physicochemical grounds for the LLPS behind the viral replication factory
[183]. This is an interesting observation, as more often, proteins driving LLPS are expected to be either native coils (as shown for many IDPs
[184][185][186][187][188][189][190]) or native PMGs (see, e.g., data for the AB region of human retinoid X receptor subtype γ (hRXRγ)
[191]).