1. Introduction
There are four levels of structural organization of proteins: primary, secondary, tertiary and quaternary structures
[1]. Primary structure refers to the sequence of amino acids. Secondary structure is the local ordering of amino acids under the action of hydrogen bonds. Tertiary structure is the spatial structure of the polypeptide chain. Quaternary structure is the mutual arrangement of several polypeptide chains relative to each other. Proteins are among the most essential and ubiquitous components of any living organism, and their roles are very diverse. The main functions of proteins are catalysis, structural support and motion, energy provision, regulation of processes, and transport. A number of proteins have a catalytic function. Over 5000 enzymes have been described to date
[2]. Structural proteins perform a supporting function, connecting the tissues of the body to each other, acting as a framework for them
[3]. Contractile proteins are proteins that provide the cell with motor function. Examples of such proteins are actin and myosin. These proteins are part of the muscles, providing the latter with the ability to contract
[4][5]. A number of proteins have a signaling function, that is, the capability to transmit various signals between the cells. For example, cytokines regulate cell functions
[6]. Transport proteins carry various compounds. An example of such a protein is hemoglobin, whose function is to transport oxygen
[7]. The functions of proteins are not limited to those listed above, and the significance of studying protein structures follows from this diversity. First of all, such work includes fundamental studies of how protein molecules function, which lead to an understanding of the principles of various physiological processes in living organisms. The study of protein structures is also of practical importance for medical applications. For example, knowledge of the structures of a number of proteins of pathogenic viruses allows researchers to elucidate complicated virus replication and evasion mechanisms, and to create more effective and safer peptide-based vaccines
[8][9]. As a separate topic,
we should look at the so-called drug-design method, or directed drug design. Currently, due to the rapid growth of computing power available to researchers, the design of drugs using molecular-modeling methods is a promising and dynamically developing area. The main directions of molecular modeling in drug design are methods based on knowledge of the ligand structure and methods based on knowledge of the target structure
[10][11][12]. A recent example of this approach is the structural study of the SARS-CoV-2 main protease (Mpro). The structure of the apo form of Mpro was solved at the beginning of 2020
[13]. The appearance of this structure in the Protein Data Bank (PDB) gave rise to a series of studies searching for new Mpro inhibitors. More than 20 structures of Mpro complexes with different inhibitors have been deposited in the PDB to date. In addition, one of the urgent problems of modern enzymology is predicting which mutations are needed for a directed change in the specificity of an enzyme. Substrate specificity is classically explained by the “key-lock” principle, proposed by Fischer and later refined by Koshland's induced-fit theory, according to which the topology of the active site corresponds to the topology of the substrate. Accordingly, substrate specificity can be influenced by changing the active site using site-directed mutagenesis. Knowledge of the spatial structure of the enzyme makes it possible to rationally construct mutant forms of the protein with the required substrate specificity. In this way, a number of enzyme mutants with changed specificity have been obtained that have industrial or medical significance
[14][15][16][17].
The first protein crystal was obtained in 1840
[18]. In 1851, a method was described for obtaining such crystals from erythrocytes
[19]. This protein was later named hemoglobin
[20]. The first diffraction pattern from a hemoglobin crystal was obtained in 1934
[21]. The first spatial structure of a protein was obtained for myoglobin in 1958
[22]. In October 1971, the Protein Data Bank (PDB) appeared. At first it had seven protein structures
[23]. Every year, the number of three-dimensional protein structures deposited there has grown rapidly. Currently, the number of experimentally determined structures deposited in the PDB exceeds 195,000. The PDB now also contains spatial structures of polynucleotides. In addition, since 2022 the PDB has contained spatial structures of proteins obtained by computational methods; there are currently more than 1,000,000 such models. In most cases, recombinant proteins are used in protein crystallography. At present, a typical experiment to determine the structure of a recombinant protein by X-ray diffraction analysis consists of several stages: obtaining the recombinant protein, purifying it, crystallization, the X-ray diffraction experiment, and solution and refinement of the protein structure. Since the late 1990s, flash cooling of protein crystals has been widely used for X-ray data collection. It reduces radiation damage, which is essential for achieving suitable data-collection statistics. Flash cooling is absolutely necessary for single-crystal X-ray experiments at modern synchrotrons, where a reduced beam is used to prevent rapid degradation of diffraction.
However, flash cooling can distort the protein structure and increase crystal mosaicity. Most dynamic processes cannot be investigated in the frozen state of crystals, although domain motions can still be analyzed using techniques such as TLS (translation, libration and screw-motion) analysis. Developments in serial microcrystallography in recent years have helped to overcome these limitations. In this approach, data are collected from microcrystal suspensions at ambient temperature, which requires special methods of microcrystal delivery. Three main sample-delivery methods are used: crystal-injection methods, fixed-target methods and hybrid delivery methods
[24][25][26][27]. Such experiments are carried out at fourth-generation synchrotrons or free-electron lasers (FELs)
[28][29]. The main achievements of this technique are structures of membrane proteins and time-resolved experiments. In time-resolved experiments, lasers or chemical triggers are used to initiate the process under study, depending on the nature of the object
[30][31][32][33][34][35]. The importance of anomalous dispersion in the development of protein crystallography should also be noted
[21].
2. Protein-Crystallization Techniques
Currently, a number of methods for protein crystallization have been developed and are widely used. Most protein crystals have been, and are still, grown by solvent vapor diffusion
[36]. The advantages of this method are its simplicity and economy. Crystallization takes place in a hermetically sealed cell containing a reservoir of undiluted precipitant solution. Water evaporates from a drop containing a mixture of protein and precipitant, where the precipitant concentration is lower, and condenses into the reservoir solution until the partial vapor pressures over the drop and over the reservoir are equal. As the concentrations of precipitant and protein increase, the solution in the drop becomes supersaturated, and at a certain stage crystals or an amorphous precipitate appear in it. The method is carried out in two versions: the drop can be “hanging” or “sitting”
[37]. Another widely used method is free diffusion through the liquid–liquid interface
[38]. In this method, a protein solution is carefully layered onto a precipitant solution in a narrow test tube. Because the diffusivity of the salt is significantly higher than that of the protein, in the first stages of mixing the concentration of the precipitant increases to a greater extent than the concentration of the protein. The precipitant concentration is chosen so that most of the many nuclei formed in the first stages of mixing dissolve, and a limited number of large crystals grow from the few that remain. The dialysis method can be considered a variant of this approach: the precipitant solution diffuses into the protein solution through a dialysis membrane, so the protein concentration remains constant
[39]. The dialysis membrane increases the likelihood of nucleation by serving as a substrate for epitaxial growth. A very simple and convenient method is crystallization under a layer of paraffin oil
[39]. Another method, not very common but very fast, is crystallization during protein sedimentation in an ultracentrifuge
[40]. This method is applicable to proteins of sufficiently large molecular weight. A solution containing protein and a low concentration of precipitant is placed in a centrifuge tube and centrifuged for 20–40 h at a speed at which the protein slowly sediments to the bottom of the tube. The centrifugal acceleration actively transports the protein to the crystallization zone. As the protein concentration at the bottom of the tube approaches the protein concentration in the crystal, nucleation occurs, followed by crystal growth. At the same time, the level of supersaturation of the precipitant remains low, and the directional acceleration, which promotes a certain orientation of protein molecules, facilitates crystallization. To prevent the crystals from dissolving after the centrifuge is stopped, they must be transferred to a solution with a high concentration of precipitant, the composition of which is selected empirically. Spanish researchers proposed carrying out crystallization through a layer of gel using the method of counterdiffusion in a capillary
[41]. The protein solution was placed in an X-ray capillary, one end of which was closed, and the other end was immersed in agarose gel in a plastic box. The precipitant solution was applied to the agarose gel. Slow diffusion of the precipitant through the gel layer led to the formation of a precipitant concentration gradient in the capillary, so that crystals grew at different distances from the capillary inlet under different conditions. As a result, the counterdiffusion method allows several growth conditions to be tested within a single capillary.
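As a rough illustration of the vapor-diffusion equilibration described earlier in this section, the following Python sketch estimates the final drop volume and concentrations. It rests on idealizing assumptions that are not stated in the text: the solutes are non-volatile, the reservoir is so much larger than the drop that its concentration does not change, the protein's effect on vapor pressure is negligible, and equilibrium is reached when the precipitant concentration in the drop equals that of the reservoir. The function name and the example numbers are hypothetical.

```python
def equilibrate_drop(protein_stock, reservoir_precipitant,
                     protein_volume, reservoir_volume_in_drop):
    """Idealized end point of a hanging- or sitting-drop experiment.

    Assumptions (illustrative only): only water leaves the drop, the
    reservoir concentration stays constant, and equilibrium is reached
    when the precipitant concentration in the drop equals that of the
    reservoir.
    """
    initial_volume = protein_volume + reservoir_volume_in_drop

    # Concentrations right after mixing (simple dilution).
    protein_start = protein_stock * protein_volume / initial_volume
    precipitant_start = (reservoir_precipitant * reservoir_volume_in_drop
                         / initial_volume)

    # Water leaves until the precipitant reaches the reservoir value,
    # so the drop shrinks by this factor and the protein is concentrated
    # by its inverse.
    shrink = precipitant_start / reservoir_precipitant

    return {
        "initial_protein": protein_start,
        "initial_precipitant": precipitant_start,
        "final_volume": initial_volume * shrink,
        "final_protein": protein_start / shrink,
        "final_precipitant": reservoir_precipitant,
    }


# Hypothetical setup: 1 uL of 10 mg/mL protein mixed with 1 uL of a
# 2.0 M precipitant reservoir solution.
result = equilibrate_drop(protein_stock=10.0, reservoir_precipitant=2.0,
                          protein_volume=1.0, reservoir_volume_in_drop=1.0)
print(result)
```

For a 1:1 drop, this idealized model halves the drop volume and doubles both the protein and precipitant concentrations relative to their values just after mixing. Real drops deviate from this because of finite reservoir size, protein precipitation and kinetics.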
Currently, there are still no rational approaches for choosing the nature and composition of the precipitant needed to obtain a protein in the crystalline state. It is not clear with which precipitant, in the presence of which additives, at what pH and at what temperature a given protein will form a crystalline rather than an amorphous precipitate. As a result, the initial stage, obtaining the protein in crystalline form, is the least predictable. Precipitant selection is sometimes aided by knowledge of the protein's behavior and properties, but in general, crystallization conditions are screened using commercial equipment and commercial crystallization-reagent kits. A number of companies offer a wide range of kits and devices for setting up crystallization. Each kit typically contains 50 or 96 precipitant solutions, which include a pH buffer, the precipitant itself, and low-molecular-weight additives. During screening, the protein solution is mixed with a precipitant solution in a certain proportion, and the drop is examined under a microscope at certain intervals for crystals or precipitate. Nevertheless, researchers continue to study the mechanisms of the formation and growth of protein crystals. The growth mechanisms have been studied by electron and atomic-force microscopy, as well as by interferometry using Michelson and Mach-Zehnder interferometers
[42][43][44]. It was shown that macromolecular crystals grow by the same mechanisms as crystals of other molecules; however, during the growth of protein and virus crystals, a new mechanism unknown for small molecules was discovered: growth by direct addition and subsequent development of whole three-dimensional nuclei. In addition, a number of attempts have been made to study the structure of precrystallization solutions by small-angle X-ray scattering
[45][46].
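The screening step described above can be sketched in Python as the layout of a 96-condition plate. This is a minimal illustration, not the design of any particular commercial kit: it assumes a simple systematic grid of pH against precipitant concentration on an 8 x 12 plate, and the function name, pH range and concentration range are all hypothetical.

```python
from itertools import product
from string import ascii_uppercase


def grid_screen(ph_values, precipitant_concs, rows=8, cols=12):
    """Map a pH x precipitant-concentration grid onto plate wells.

    Illustrative assumption: one pH per row and one precipitant
    concentration per column of a standard 8 x 12 (96-well) plate.
    """
    if len(ph_values) != rows or len(precipitant_concs) != cols:
        raise ValueError("need one pH per row and one concentration per column")

    plate = {}
    for (r, ph), (c, conc) in product(enumerate(ph_values),
                                      enumerate(precipitant_concs)):
        well = f"{ascii_uppercase[r]}{c + 1}"  # e.g. "A1", "H12"
        plate[well] = {"pH": ph, "precipitant_M": conc}
    return plate


# Hypothetical ranges: pH 4.5 to 8.0 in 0.5 steps, precipitant 0.2 to 2.4 M.
ph_values = [4.5 + 0.5 * i for i in range(8)]
concs = [round(0.2 * (j + 1), 1) for j in range(12)]
plate = grid_screen(ph_values, concs)

print(len(plate))    # number of conditions on the plate
print(plate["A1"])   # lowest pH, lowest precipitant concentration
print(plate["H12"])  # highest pH, highest precipitant concentration
```

A grid like this is most useful for refining a condition that has already produced crystals or precipitate; initial screens typically sample a much wider range of precipitants and additives.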
Despite well-developed methods of crystallization and an understanding of the general patterns of growth of protein crystals, a number of proteins cannot be crystallized or yield only poorly diffracting crystals. In such cases, it is advisable to use protein-engineering methods to increase the probability of the formation of additional intermolecular contacts in the crystal lattice
[47]. To create new intermolecular contacts, individual amino-acid residues on the surface of the protein molecule are replaced by site-directed mutagenesis. If it is necessary to increase the solubility of the protein, some hydrophobic residues on the surface can be replaced with polar ones
[48]. It has been noted that proteins have an entropic surface protection: the presence of charged lysine and glutamic-acid residues on the surface prevents the formation of nonspecific aggregates and precipitation
[49]. Replacing them with alanine or other small residues reduces the conformational entropy of the surface. In this way, crystallization conditions can be optimized. New intermolecular contacts can also be created by the chemical modification of individual amino-acid residues, for example, by the acetylation of lysine residues
[50]
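As an illustration of the surface-mutagenesis idea above, the following Python sketch scans a sequence for short clusters of lysine and glutamic-acid residues and lists candidate alanine substitutions. It is a simplified heuristic under stated assumptions, not a published algorithm: it works on sequence alone and therefore assumes that clustered Lys/Glu residues are surface-exposed, which in practice should be checked against a structure or model. The function names, window size, threshold and example sequence are hypothetical.

```python
def find_lys_glu_clusters(sequence, window=3, min_hits=2, residues="KE"):
    """Return (1-based start, fragment) for windows rich in Lys/Glu.

    Heuristic assumption: a window with at least `min_hits` K/E
    residues marks a flexible, high-entropy surface patch.
    """
    sequence = sequence.upper()
    clusters = []
    for start in range(len(sequence) - window + 1):
        fragment = sequence[start:start + window]
        hits = sum(fragment.count(r) for r in residues)
        if hits >= min_hits:
            clusters.append((start + 1, fragment))
    return clusters


def propose_alanine_mutations(sequence, clusters, residues="KE"):
    """List K/E -> A substitutions (e.g. "K4A") covered by the clusters."""
    sequence = sequence.upper()
    positions = set()
    for start, fragment in clusters:
        for offset, aa in enumerate(fragment):
            if aa in residues:
                positions.add(start + offset)
    return [f"{sequence[p - 1]}{p}A" for p in sorted(positions)]


# Hypothetical 15-residue sequence used only to exercise the functions.
seq = "MSTKEKLAGDEEKTV"
clusters = find_lys_glu_clusters(seq)
print(clusters)
print(propose_alanine_mutations(seq, clusters))
```

In a real project, only a few of the proposed positions would be mutated at a time, and residues involved in function or known contacts would be excluded.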