The fundamentals of how protein–protein/RNA/DNA interactions influence the structures and functions of the workhorses of the cell were well documented in the 20th century. A diverse set of methods exists to determine such interactions between different components. In particular, mass spectrometry (MS), with its advanced instrumentation, has become a significant approach for analyzing a diverse range of biomolecules and for bringing insight into their biomolecular processes. Cross-linking mass spectrometry (CLMS) holds promise for identifying interaction sites in larger and more complex biological systems.
Decades of research in cell biology, molecular biology, biochemistry, structural biology, and biophysics have produced a detailed understanding of individual DNA/RNA/protein molecules and their interconnected networks. A great diversity of techniques has emerged for studying their structural interactions. Adding further complexity, the structures of these workhorses of the cell are themselves dynamic, converting from one dominant form to another based on the proportions of particular proteoforms present for any given biomolecule. Accordingly, beyond the protein–protein/DNA/RNA interaction landscape, there is an entire universe to explore with respect to their structure and dynamics. One high-throughput technique, cross-linking mass spectrometry (CLMS), has emerged as a dominant player in understanding both interaction landscapes and the resulting protein/DNA/RNA structures. This review covers the different methods that are available to study protein–protein/DNA/RNA interactions and provides insight into CLMS, a collection of methods that are well suited to achieving a better understanding of intra- or intermolecular interactions.
Protein–protein interactions (PPIs) play a crucial role in all biological processes, and characterizing them is essential for understanding the molecular mechanisms of the proteins involved, which are often termed the workhorses of cells. Over 80% of proteins do not function in isolation, but rather interact with one another to form stable or transitory complexes, as demanded by their function [1][2]. PPIs are considered an emerging class of drug target, since aberrant protein–protein interactions can participate in the pathogenesis of various human diseases and can therefore offer significant options for diagnostic as well as therapeutic targets. A large number of experimental methods have been developed to study PPIs, based on biophysical, biochemical, or genetic principles (as shown in Figure 1), and each type has advantages as well as limitations regarding structural coverage of amino acid sequences, sensitivity, and specificity [1][3][4][5][6]. Methods can be chosen to emphasize different aspects of PPIs, such as identifying protein binding partners, generating structural details of protein complexes, analyzing kinetic and thermodynamic constants of interactions, visualizing and quantifying PPIs in real time in living cells, and mapping small interactomes that refer to specific cellular pathways [1][3][4][5].
Mass spectrometry (MS) is a powerful approach that is useful in several areas beyond CLMS (e.g., hydrogen-deuterium exchange [7]), and it is particularly useful in combination with other techniques, providing steady progress for structural biology. Technical advances in mass spectrometry have made it possible to study protein–protein interactions at scales ranging from simple protein complexes to proteome-wide experiments, which were not previously accessible by traditional techniques [8][9][10]. Above all, mass spectrometry methods have democratized protein interaction analysis by making it accessible, relatively inexpensive, and high throughput. Given these advantages, MS is becoming increasingly popular in structural biology for analyzing three-dimensional (3D) structures and mapping their interactions with partner molecules. While emerging techniques, like cryoEM, can capture large numbers of protein complexes, much of the dynamics of the structure is missed, and such gaps in structural data sets can be bridged by low-resolution methods, for example, chemical cross-linking (CL) [11][12][13]. The overall architecture of a protein complex can be obtained through electron microscopy (EM) [14], small-angle X-ray scattering (SAXS) [15], and ion-mobility (IM) MS [16], whereas the precise residues forming protein–protein interactions can be identified using hydrogen-deuterium exchange [17], chemical cross-linking [11][12][13], and chemical footprinting [18].
Fundamentally, mass spectrometry methods measure the mass-to-charge ratio of an ion. Initially, analytes are ionized and transferred into the gas phase prior to their separation according to mass-to-charge ratio in a mass analyzer. Subsequently, ions that emerge from the mass analyzer are recorded using a detector. The most common way to ionize a peptide or protein sample is electrospray ionization (ESI) [41]. However, a complementary ionization technique, known as matrix-assisted laser desorption/ionization (MALDI), is often used because it is relatively easy for novices to use. In a few studies analyzing intact protein complexes, MALDI mass spectrometry (MALDI-MS) has been used together with chemical cross-linking techniques [42][43][44]. Regardless of the ionization method, various types of mass analyzers are available, including the following common ones: ion trap, time-of-flight (TOF), quadrupole, and orbitrap. Each of these mass analyzers may be configured in various ways with similar or different mass analyzers to form unique types of mass spectrometers [45].
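As a brief illustration of the mass-to-charge ratio mentioned above, the following minimal sketch converts between the neutral mass of an analyte and the m/z of its protonated ion, as typically observed in positive-mode ESI. The function names and the example mass are hypothetical and chosen only for illustration; the sketch assumes that all charge comes from added protons.

```python
# Minimal sketch: m/z of a protonated ion [M + zH]^z+.
# Assumption: every charge is carried by an added proton (typical for
# positive-mode ESI); adducts such as Na+ are not considered.

PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def mz(neutral_mass: float, charge: int) -> float:
    """Return the m/z of an analyte of the given neutral mass and charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz_value: float, charge: int) -> float:
    """Invert mz(): recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz_value - PROTON_MASS)


if __name__ == "__main__":
    # Hypothetical 2000 Da peptide observed at charge states 1 to 3.
    for z in (1, 2, 3):
        print(z, round(mz(2000.0, z), 4))
```

The same analyte therefore appears at several m/z values, one per charge state, which is why the charge state has to be assigned before a mass can be reported.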
Quantitative cross-linking mass spectrometry (QCLMS) approaches investigate protein structures as well as the dynamics of their interactions [46][47][48][49][50][51]. QCLMS is often performed using an isotope-labeled cross-linker that introduces a characteristic mass shift specific to the cross-linked peptides [46][47][50][52][53], followed by quantitation of cross-links in MS1. However, the limited availability of isotope-labeled cross-linkers restrains the implementation of this approach in QCLMS [54]. Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC), a popular alternative to isotope-labeled cross-linkers, relies on the metabolic incorporation of isotope-labeled amino acids from culture media. Metabolically labeled samples are cross-linked and pooled; relative quantitation is performed from MS1 data using the characteristic mass shift introduced into peptides by the incorporated isotope-labeled amino acids [48]. SILAC enables a comparison of multiple samples per analysis (usually two), and it can also be used to monitor amino acid incorporation over a time course (pulsed SILAC), which is especially valuable for exploring dynamics in biological processes. Recently, iCLASPI (in vivo cross-linking assisted and stable isotope labeling by amino acids in cell culture (SILAC)-based protein identification), an approach combining SILAC and in vivo cross-linking, has been implemented to quantify native protein–protein interactions in HEK293T cells. iCLASPI has been successfully used to profile native protein–protein interactions involving the core histones H3 and H4 in their biological context [55]. Chavez et al. successfully combined SILAC and cross-linking to investigate key protein–protein interactions and to examine the conformational changes of Hsp90 upon treatment with 17-AAG, an inhibitor of the Hsp90 N-terminal domain [49].
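The MS1-based SILAC quantitation described above can be sketched as follows. This is an illustrative example only: it assumes the commonly used heavy labels 13C6 15N2-lysine (+8.014199 Da) and 13C6 15N4-arginine (+10.008269 Da), which the text does not specify, and the peptide sequences, intensities, and function names are hypothetical.

```python
# Minimal sketch of SILAC quantitation for a cross-linked peptide pair.
# Assumptions (not from the text): heavy Lys8 and Arg10 labels, full label
# incorporation, and made-up sequences and intensities.
import math

HEAVY_SHIFT = {"K": 8.014199, "R": 10.008269}  # Da per labeled residue


def silac_shift(*peptides: str) -> float:
    """Expected heavy-minus-light mass shift summed over all given peptides.

    For a cross-link, pass both linked peptides: every labeled K or R in
    either peptide contributes to the shift of the cross-linked species.
    """
    return sum(HEAVY_SHIFT.get(aa, 0.0) for pep in peptides for aa in pep)


def log2_ratio(heavy_intensity: float, light_intensity: float) -> float:
    """log2(heavy/light) from the MS1 intensities of the two isotopic forms."""
    if heavy_intensity <= 0 or light_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(heavy_intensity / light_intensity)


if __name__ == "__main__":
    # Hypothetical cross-linked pair with two lysines and one arginine.
    print(round(silac_shift("PEPTIDEK", "LVKGR"), 6))
    print(log2_ratio(4.0e6, 1.0e6))
```

Because a cross-linked species contains two peptides, its heavy and light forms are usually separated by a larger mass shift than those of a linear peptide, which is one reason the shift is computed over both sequences here.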
Isobaric Tags for Relative and Absolute Quantitation (iTRAQ) and Tandem Mass Tags (TMT) are chemical labels that are introduced to peptides after protease digestion and allow relative protein quantitation via MS2- or MS3-encoded data. Currently, TMT/iTRAQ labeling enables comparison of up to 16 samples (valid for TMT) in a single MS analysis. Notably, the iTRAQ/TMT reporter ions used for quantitation are cleaved from labeled peptides during fragmentation by collision-induced dissociation, allowing quantitation from the relative intensities of these fragment ions. Yu et al. [56] implemented the TMT approach in a multiplexed comparison of protein complex dynamics and protein–protein interactions. Their QMIX (Quantitation of Multiplexed, Isobaric-labeled cross (X)-linked peptides) workflow with TMT labeling achieves peptide quantitation from MS3 data, which eliminates the interference from ions observed in MS1 data with isotope-labeled cross-linkers or SILAC cross-linking [56]. Furthermore, precise MS2 quantitation of protein cross-linking can also be achieved in a label-free manner using data-independent acquisition (DIA). The extraction of cross-linked peptide quantities from DIA data is usually performed using a spectral library prepared specifically from the investigated samples. Muller et al. developed a novel DIA-QCLMS approach utilizing photoactivatable cross-linkers, ensuring reliable quantitation of cross-linked proteins across a wide range of environmental changes, such as pH, temperature, pressure, or concentration [51][52][57]. DIA approaches promise to be useful in future applications of quantitative cross-linking proteomics, due to their precision, reproducibility, and label-free nature.
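The reporter-ion quantitation used with isobaric tags can be illustrated with a short sketch. It is not the QMIX workflow of Yu et al.; it only shows the general idea of turning the reporter-ion intensities of one fragmentation spectrum into relative abundances per sample. The channel names follow the usual TMT naming, while the intensities and function names are hypothetical.

```python
# Minimal sketch: relative quantitation from isobaric-tag reporter ions.
# Assumptions: one spectrum, one reporter intensity per channel, and no
# correction for isotopic impurities or co-isolation interference.
from typing import Dict


def relative_abundance(reporters: Dict[str, float]) -> Dict[str, float]:
    """Scale reporter-ion intensities so that all channels sum to 1."""
    total = sum(reporters.values())
    if total <= 0:
        raise ValueError("total reporter intensity must be positive")
    return {channel: value / total for channel, value in reporters.items()}


def ratio_to_reference(reporters: Dict[str, float],
                       reference: str) -> Dict[str, float]:
    """Express every channel as a fold change relative to one channel."""
    ref_value = reporters[reference]
    if ref_value <= 0:
        raise ValueError("reference intensity must be positive")
    return {channel: value / ref_value for channel, value in reporters.items()}


if __name__ == "__main__":
    # Hypothetical reporter intensities for one cross-linked peptide.
    spectrum = {"126": 100.0, "127N": 300.0, "127C": 200.0, "128N": 400.0}
    print(relative_abundance(spectrum))
    print(ratio_to_reference(spectrum, "126"))
```

Since all channels are read from the same spectrum, the comparison between samples is made within a single measurement rather than across separate runs.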
Moreover, techniques such as proximity-dependent biotin labeling (the BioID technique) in living cells help to understand the plasticity of protein networks within heterogeneous cellular populations. In combination with nanopore technology, such an approach could help tackle pending biological questions, e.g., the identification of peptidyl-prolyl isomerase (PPIase) substrates. PPIase substrates preserve their primary structure and molecular mass; the sole change is the interconversion, catalyzed by PPIases, between the cis and trans isomers of the substrates’ proline peptide bonds. This subtle modification triggers important changes to the substrate’s fate, such as subcellular translocation, degradation, or rewiring of its protein–protein interaction networks. PPIase enzymes act as central molecular switches, as exemplified by the peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 (Pin1), which has been extensively studied and shown to be involved in multiple diseases [58][59][60]. Herein, we propose that CLMS techniques, although they have not previously been applied to such problems, could be merged with novel interactomics techniques (proximity-dependent labeling by BioID or by the engineered biotin ligase TurboID). Combining these two approaches may bring spatial resolution to CLMS at a sub-organelle level, since the BioID labeling radius is estimated at ~10 nm and CLMS would be focused on the proteins in the neighborhood of a given bait.
While there has been some success in proteome-wide CLMS [61][62][63][64], both chemical cross-linking MS and the related techniques of native MS [65][66] typically require prior purification of a single protein or protein complex, which can be achieved through overexpression and purification of a recombinant version of the particular system. When investigating protein assemblies, individual components need to be purified so that the whole complex can later be reconstituted in vitro; alternatively, in vivo reconstitution can be achieved by co-expressing the various subunits [67]. Because reconstituted systems mostly rely on large yields to aid structural analysis and are frequently produced using bacteria (mostly E. coli) as the host system, functionally important post-translational modifications and associations with interacting protein partners may be lost during this process. Thus, various biochemical approaches that directly enable the isolation of endogenous protein complexes from cells or tissues should be explored when beginning a new project [68].
Modern mass spectrometry coupled with the chemical cross-linking of juxtaposed amino acids can provide important structural information. There are two main types of cross-linking strategy, in which the cross-linking reagent is activated either by UV light or by chemical methods [11][12][13][69]. Chemical cross-linking is a classical approach for determining protein–protein interactions and is also one of the first approaches that was used to map large complexes, for example, the ribosome [2]. Generally, cross-linking techniques link two or more proteins present in a complex by covalent bonds and, as the name implies, via a molecule designed to bridge juxtaposed amino acids, i.e., to chemically cross-link residues. The chosen cross-linker is a chemical reagent that contains two or more reactive groups connected through a spacer or linker of various lengths [70]. By using this method, low-affinity protein–protein contacts, or some specific interactions, can be detected that are difficult to characterize by other methods (e.g., nuclear magnetic resonance (NMR), X-ray crystallography, etc.). Moreover, cross-linking techniques have also been applied to stabilize transient protein–protein interactions in a dynamic process both in vitro and in vivo [1][2]. However, these chemical methods can have considerable weaknesses related to the lack of spatial localization in a cell and the lack of control over activity. Thus, to evaluate PPIs under conditions as close as possible to native ones, photo-cross-linking methods are valuable due to their ability to generate reactive species in situ instantaneously upon irradiation with UV light [11][12][13][69].
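Because a cross-linker bridges two residues through a spacer of known length, each identified cross-link can be read as an upper bound on the distance between the linked residues. The sketch below shows one simple way to check such a restraint against a structural model. All numbers are assumptions made for illustration and do not come from the text: an 11.4 Å spacer (typical of DSS/BS3-type reagents), roughly 6.5 Å per lysine side chain, and a 3 Å tolerance for backbone flexibility. The function names are hypothetical.

```python
# Minimal sketch: test whether a cross-link is compatible with a structure.
# Assumptions (illustrative, not from the text): lysine-lysine cross-link,
# C-alpha coordinates in angstroms, and a simple additive distance bound.
import math
from typing import Sequence


def max_ca_ca_distance(spacer_length: float,
                       side_chain_length: float = 6.5,
                       tolerance: float = 3.0) -> float:
    """Upper bound on the C-alpha to C-alpha distance of two linked residues.

    The bound adds the spacer, both side chains, and a flexibility tolerance.
    """
    return spacer_length + 2 * side_chain_length + tolerance


def satisfies_crosslink(ca_a: Sequence[float],
                        ca_b: Sequence[float],
                        spacer_length: float) -> bool:
    """True if the two C-alpha positions lie within the cross-linker's reach."""
    return math.dist(ca_a, ca_b) <= max_ca_ca_distance(spacer_length)


if __name__ == "__main__":
    spacer = 11.4  # angstroms, assumed DSS/BS3-like spacer
    print(round(max_ca_ca_distance(spacer), 1))
    print(satisfies_crosslink((0.0, 0.0, 0.0), (20.0, 0.0, 0.0), spacer))
    print(satisfies_crosslink((0.0, 0.0, 0.0), (40.0, 0.0, 0.0), spacer))
```

A cross-link that violates the bound in a given model does not necessarily indicate an error; it may instead point to an alternative conformation or to an intermolecular contact.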
Cross-linking is always followed by other downstream methods to further analyze the cross-linked proteins, often using sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) to separate the cross-linked from non-cross-linked proteins, tandem affinity purification (TAP) or immunoaffinity chromatography for affinity-based purification of cross-linked products, and mass spectrometry methods for identifying the interacting partners [71][72]. Additionally, SDS-PAGE analysis is very helpful in the early stages of analysis when empirically working out the correct ratio of cross-linker to protein complex. A limitation of chemical cross-linking with these methods is the high risk of detecting non-specific interactions, which can result from proteins that are in close proximity but may not be functionally related. However, this limitation can be addressed by using more than one cross-linker, of differing activities or spanning various distances, and by varying the ratio of reagent to protein complex. Taken together, while the CLMS technique is relatively straightforward to implement, the identification of relevant cross-linked proteins can be quite demanding due to the intracellular dynamic range of protein expression, which can range from one to one million copies, and the low abundance of cross-linked species [1].
Innovative developments in the biological applications of MS have led to a large number of methods and have made it relatively simple to identify proteins alone or in complexes using CLMS. Large macromolecular complexes, like ribosomes or exosomes, have been purified and analyzed directly using mass spectrometry [73][74][75][76]. Recently, structural techniques that combine chemical cross-linking with mass spectrometry have hit their stride, allowing various biologically relevant molecular machines to be successfully studied in the past few years [77]. The various workflows that have been developed to implement CLMS represent a vast toolkit that can help to provide novel insight into the structure and organization of proteins, to define protein–protein interactions, and to probe PPI interfaces.
This entry is adapted from the peer-reviewed paper 10.3390/biom11030382