The genome of living cells is continuously exposed to endogenous and exogenous attacks, and this exposure is amplified at high temperatures. Alkylating agents cause DNA damage, leading to mutations and cell death; a class of enzymes known as alkylguanine-DNA-alkyltransferases (AGTs) protects the DNA from this damage, in particular by recognizing and repairing guanines alkylated at the O6 position. The peculiar irreversible self-alkylation reaction of these enzymes has triggered numerous studies, especially on the human homologue, aimed at identifying effective inhibitors in the fight against cancer. In modern biotechnology, engineered variants of AGTs are developed for use as protein tags for the attachment of chemical ligands. In the last decade, research on AGTs from (hyper)thermophilic sources has proved useful as a model system to clarify numerous phenomena that are also common to mesophilic enzymes. This review traces recent progress on this class of thermozymes, emphasizing their usefulness in basic research and their consequent advantages for in vivo and in vitro biotechnological applications.
In the last decade, the O6-alkylguanine-DNA-alkyltransferase from Saccharolobus solfataricus (SsOGT) has been characterized through detailed physiological, biochemical, and structural analysis. Due to its intrinsic stability, the SsOGT protein has proven to be an outstanding model for clarifying the relationships between function and structural characteristics. Like all AGTs, SsOGT has a reaction mechanism based on the recognition of the damaged nucleobase on DNA, followed by a one-step reaction in which the alkyl group of the damaged guanine is irreversibly transferred to a cysteine residue in the active site. For this reason, AGTs are also called suicide or kamikaze proteins, and they react with their natural substrate with a 1:1 stoichiometry. The disadvantage of this elegant catalysis is that, upon alkylation, the protein is self-inactivated and destabilized, which triggers its recognition by cellular systems and its degradation by the proteasome.
Saccharolobus solfataricus (previously known as Sulfolobus solfataricus) is a microorganism first isolated in 1980 in the Solfatara volcano (Pisciarelli-Naples, Italy), which thrives in volcanic hot springs at 80 °C and in a pH range of 2.0–4.0. In order to protect its genome in these harsh conditions, S. solfataricus evolved several efficient protection and repair systems. S. solfataricus is highly sensitive to the alkylating agent methyl methane sulfonate (MMS), showing a transient growth arrest when treated with MMS concentrations above 0.25 mM and up to 0.7 mM. Interestingly, although the ogt RNA level increases after MMS treatment, the level of the corresponding enzyme decreases, suggesting that it is degraded in cells in response to the alkylating agent and, more generally, to cellular stress. Under these treatment conditions, however, the protein level rises after a few hours and, in parallel, the growth of Saccharolobus starts again, indicating a role of SsOGT in the efficient repair of DNA alkylation damage.
Various assays to measure AGT activity are reported in the literature. The first methods were based on the use of oligonucleotides carrying radioactive (3H or 14C) O6-alkylguanine groups. Proteinase K digestion was then carried out to measure the levels of labelled S-methyl-cysteine in the lysate with an automatic amino acid analyser. A very similar, but simpler and faster, radioactive assay used a 32P-terminally labelled oligonucleotide containing a modified guanine within the recognition sequence of a methylation-sensitive restriction enzyme (such as MboI). Repair of the DNA by the AGT thereby allowed the restriction enzyme to cut. This procedure was also used by Ciaramella’s group to identify for the first time the activity of SsOGT. This test has the advantage that the digested fragment can be analysed directly by electrophoresis on a polyacrylamide gel.
The precision of this assay was subsequently improved by separating the digested oligonucleotides by HPLC. The chromatographic separation allowed the concentration of active AGT to be calculated from the radioactivity of the peak corresponding to the digested fragment. Similarly, in 2002 Moschel’s group developed an analysis of hMGMT reaction products based on HPLC separation. This test investigated the degree of inhibition by oligonucleotides carrying O6-MG or O6-BG at different positions, from the 3’ to the 5’ end, and whether they could be used as chemotherapy agents. IC50 values were obtained by quantifying the active protein remaining after the reaction with radioactive DNA.
Although these assays measure the protein activity, the use of radioactive materials and chromatographic separations makes them long, tedious, and unsafe.
An alternative approach was proposed in 2010 by the group of Carme Fàbrega, who set up an assay based on the thrombin DNA aptamer (TBA), a single-stranded 15-mer DNA oligonucleotide identified via Systematic Evolution of Ligands by EXponential enrichment (SELEX), which in its quadruplex form binds thrombin protease with high specificity and affinity. In this assay, they attached a fluorophore and a quencher to the TBA. The quadruplex structure of this oligonucleotide is compromised if a central O6-MG is present, which prevents the two probes from coming close to each other. Repair of the oligonucleotide by an AGT allows the quadruplex structure to fold, so that Förster Resonance Energy Transfer (FRET) takes place, resulting in a decrease of the fluorescence intensity.
Recently, the introduction of fluorescent derivatives of O6-BG (such as SNAP Vista Green, New England Biolabs) has made possible the development of a novel DNA alkyl-transferase assay. Because the AGT covalently binds the benzyl-fluorescein moiety of its substrate during the reaction, the protein product can be loaded immediately on an SDS-PAGE gel; gel-imaging analysis of the fluorescence intensity gives a direct measure of the protein activity because of the 1:1 protein/substrate stoichiometry (Figure 1). The fluorescent protein signals obtained at different times (corrected for the amount of loaded protein, as determined by Coomassie staining) are plotted, and a second-order rate constant is determined. This method can be applied to all AGTs that bind O6-BG, with the exception of the E. coli Ada-C.
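As an illustration of how a second-order rate constant could be extracted from such a time course, the following minimal Python sketch assumes pseudo-first-order conditions (fluorescent substrate in large excess over the protein), so that the corrected signal follows F(t) = Fmax·(1 − exp(−k_obs·t)) and the second-order constant is k_obs divided by the substrate concentration. The fitting scheme, the function names, and the numerical values are assumptions chosen for illustration; they are not taken from the cited studies, whose exact data treatment may differ.

```python
import math


def pseudo_first_order_rate(times_s, signals, plateau):
    """Estimate k_obs (s^-1) from corrected fluorescence signals.

    Fits -ln(1 - F/Fmax) = k_obs * t with a least-squares line through
    the origin. `plateau` is the signal at complete labelling (Fmax).
    """
    numerator = 0.0
    denominator = 0.0
    for t, f in zip(times_s, signals):
        fraction_unreacted = 1.0 - f / plateau
        if t <= 0 or fraction_unreacted <= 0:
            continue  # these points cannot be log-transformed
        y = -math.log(fraction_unreacted)
        numerator += t * y
        denominator += t * t
    if denominator == 0:
        raise ValueError("need at least one usable time point")
    return numerator / denominator


def second_order_rate(k_obs, substrate_molar):
    """Convert k_obs (s^-1) into a second-order constant (M^-1 s^-1)."""
    return k_obs / substrate_molar


# Synthetic (hypothetical) time course, not experimental data.
times = [30, 60, 120, 300]                                    # seconds
signals = [1000.0 * (1 - math.exp(-0.01 * t)) for t in times]
k_obs = pseudo_first_order_rate(times, signals, plateau=1000.0)
k_second_order = second_order_rate(k_obs, substrate_molar=5e-6)
```

If the substrate is not in excess, the pseudo-first-order simplification no longer holds and the full second-order rate equation would have to be fitted instead.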
Furthermore, an alkylated double-stranded DNA (dsDNA) oligonucleotide can be included in a competition assay with the fluorescein substrate. This non-fluorescent substrate lowers the final fluorescent signal in the gel-imaging analysis in a concentration-dependent manner. In this way, it is possible to measure the activity of AGTs on their natural substrate, obtaining an indirect measure of methylation repair efficiency (Figure 1). By using this methodology, it was even possible to discriminate SsOGT activity according to the position of the O6-MG on the DNA (see below), in line with previous data on hMGMT.
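One common way to summarise a competition experiment of this kind is the competitor concentration that halves the fluorescent signal (an IC50). The sketch below is an assumption about how such a value could be estimated, using log-linear interpolation between the two measured points that bracket the half-maximal signal. The function name and the example numbers are hypothetical and are not data from the cited work.

```python
import math


def ic50_by_interpolation(concentrations, signals, signal_without_competitor):
    """Estimate the competitor concentration giving half the uninhibited signal.

    Interpolates linearly in log10(concentration) between the two data
    points that bracket the half-maximal signal. Concentrations must be
    positive. Returns None if the data never cross the half-maximal signal.
    """
    half = signal_without_competitor / 2.0
    points = sorted(zip(concentrations, signals))
    for (c_low, s_low), (c_high, s_high) in zip(points, points[1:]):
        if s_low >= half >= s_high and s_low != s_high:
            fraction = (s_low - half) / (s_low - s_high)
            log_ic50 = math.log10(c_low) + fraction * (
                math.log10(c_high) - math.log10(c_low)
            )
            return 10 ** log_ic50
    return None


# Hypothetical competitor concentrations (µM) and band intensities.
example_ic50 = ic50_by_interpolation(
    concentrations=[0.1, 1.0, 10.0, 100.0],
    signals=[90.0, 70.0, 30.0, 10.0],
    signal_without_competitor=100.0,
)
```

Interpolation avoids any dependence on a fitting library; with more data points, fitting a full dose-response curve would normally give a more robust estimate.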
The recombinant SsOGT protein, heterologously expressed in E. coli, has been fully characterized using the fluorescent assay described above, and some results are compiled in Table 1. In agreement with its origin, the protein showed optimal catalytic activity at 80 °C, while retaining residual activity at lower temperatures (Table 1), and in a pH range between 5.0 and 8.0. Like most thermophilic enzymes, SsOGT is resistant over a wide range of reaction conditions, such as ionic strength, organic solvents, common denaturing agents, and proteases. Interestingly, chelating agents do not affect the activity of this enzyme. Crystallographic data clarified this observation, as the archaeal enzyme lacks a zinc ion in its structure, whereas this ion is important for the correct folding of hMGMT.
Table 1. Comparison of the biochemical properties of SNAP-tag, SsOGT, and its H5 mutant.
All steps of AGT activity (alkylated DNA recognition, DNA repair, irreversible trans-alkylation of the catalytic cysteine, and recognition and degradation of the alkylated protein) have been structurally characterized. Most information comes from the classic studies on hMGMT, as well as on Ada-C and OGT from Escherichia coli. Other AGT structures are also available on the Protein Data Bank site (http://www.rcsb.org/pdb/results/results.do?tabtoshow=Current&qrid=D3B02F3B).
All AGTs are inactivated after the reaction and degraded via the proteasome; in higher organisms, the degradation is preceded by protein ubiquitination. It is a common view that the recognition of alkylated AGTs is due to a conformational change; however, data on the structure and properties of alkylated AGTs are limited because alkylation greatly destabilizes their folding. The 3D structures of methylated and benzylated hMGMT were obtained only from flash-frozen crystals, and showed that alkylation of the catalytic cysteine (C145) induces subtle conformational changes. Consequently, these structures might not reflect the physiological conformation of the alkylated hMGMT.
Concerning the interactions with DNA, SsOGT binds methylated oligonucleotides, but its repair activity depends on the position of the alkyl group. To efficiently repair the alkylated base on a DNA double helix, the protein requires at least three bases from either the 5′ or the 3′ end, because of the interactions it needs to form with the double helix. Structural analysis confirmed these data.
To overcome the serious limitation in obtaining structural data from mesophilic AGTs after the reaction, studies have moved to thermostable homologues, based also on the knowledge that all AGTs share a common C-terminal domain (CTD) structure. In contrast to the human counterpart, alkylated SsOGT was soluble and relatively stable, thus allowing in-depth analysis of the protein in its post-reaction form. Structural and biochemical analysis of the archaeal OGT, including after its reaction with a bulkier adduct in the active site (benzyl-fluorescein), suggested a possible mechanism of alkylation-induced SsOGT unfolding and degradation (Figure 2).
On the basis of their data, Perugino and co-workers suggested a general model for the mechanism of post-reaction AGT destabilization: the so-called active-site loop moves towards the bulk solvent as a result of the covalent binding of the alkyl adduct to the catalytic cysteine, and the extent of the loop movement and dynamics correlates with the steric hindrance of the adduct (Figure 2). The destabilization of this protein region then triggers the recognition of the alkylated protein by the degradation pathway.
The introduction of the SNAP-tag technology enabled a wide variety of in vivo and in vitro labelling approaches for biological studies, by fusing any protein of interest (POI) to this protein tag. However, because SNAP-tag originates from hMGMT, its extension to extremophilic organisms and/or harsh reaction conditions is seriously limited.
Following the same approach that Kai Johnsson used for hMGMT, an engineered version of SsOGT was produced. This protein, called SsOGT-H5, contains five mutations in the helix-turn-helix domain, which abolish any DNA-binding activity. In addition, a sixth mutation was made in the active-site loop, where the serine residue at position 132 was replaced by a glutamic acid (S132E). This modification increased the catalytic activity of SsOGT, as was observed in the engineered version of hMGMT during the SNAP-tag development. SsOGT-H5 shows slightly lower heat stability than the wild-type protein (Table 1), whereas its resistance to other denaturing agents is maintained. Moreover, SsOGT-H5 is characterised by a surprisingly high catalytic activity at lower temperatures, maintaining reaction rates comparable to those at physiological temperatures (Table 1). These characteristics make this mutant a potential alternative to SNAP-tag for in vivo and in vitro biotechnological applications. The stability against thermal denaturation allowed Miggiano and co-workers to obtain the structure of the protein after the reaction with the fluorescent substrate SNAP-Vista Green, revealing the peculiar destabilization of the active-site loop after the alkylation of the active cysteine.
The Saccharolobus OGT mutant was first tested as a protein tag fused to two thermostable S. solfataricus proteins heterologously expressed in E. coli. The chimeric proteins were correctly folded, and the tag interfered neither with the enzymatic activity of the tetrameric S. solfataricus β-glycosidase (Ssβgly) nor with that of the hyperthermophile-specific DNA topoisomerase reverse gyrase. Furthermore, the stability of H5 made it possible to heat-treat the cell-free extract to remove most of the E. coli proteins and to perform the β-glycosidase assay at high temperatures without the need to remove the tag.
As the applicability of the thermostable tag under in vivo conditions is very important, SsOGT-H5 was also expressed in thermophilic organisms. The fluorescent AGT assay allows the detection of SsOGT-H5 both in living cells and in vitro in cell-free extracts. To assay the activity of SsOGT-H5, it was necessary to choose models in which the endogenous AGT activity is absent or suppressed. Thermus thermophilus is an ogt- species with only one agt homologue (TTHA1564), whose annotation corresponds to an alkyltransferase-like protein (ATL). ATLs are a class of proteins present in prokaryotes and lower eukaryotes, with amino acid motifs similar to those of the AGT CTD, in which a tryptophan residue replaces the cysteine in the active site. Like AGTs, ATLs use a helix-turn-helix motif to bind the minor groove of the DNA, but they do not repair it; they only recruit and interact with proteins involved in the nucleotide excision repair system.
Whereas T. thermophilus is a natural ogt knockout organism, Sulfolobus islandicus possesses an ogt gene very similar to that of S. solfataricus; this gene was silenced by a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-based technique, and the resulting strain was then used as a host organism.
The fluorescent signal obtained by SDS-PAGE gel imaging revealed that SsOGT-H5 is not only efficiently expressed in these thermophilic microorganisms but also correctly folded and active, demonstrating that SsOGT-H5 might be used as an in vivo protein tag at high temperatures. As is the case with SNAP-tag in human cells, the use of SsOGT-H5 with different fluorescent substrates gives the opportunity to perform multi-colour fluorescence studies, following a POI inside living “thermo cells” at different stages and in different localizations.
As most biotechnological processes require harsh operational conditions, the immobilization of very robust enzymes on solid supports is often essential. By definition, an immobilized enzyme is a “physically confined biocatalyst, which retains its catalytic activity and can be used repeatedly”. Protein immobilisation offers several advantages, such as the recovery and reuse of the catalysts, as well as the physical separation of the enzymes from the reaction mixture. Currently, different immobilisation strategies are available, from physical adsorption to covalent coupling. However, all these procedures require purified biocatalysts and suffer from problems related to steric hindrance between the catalyst, the substrate, and the solid support, which increase the cost and duration of the production processes.
The introduction of “cell-based” immobilisation systems resulted in a significant improvement, reducing both the time and the cost of the process. One of the most widely used display strategies is the simultaneous heterologous expression of enzymes and their in vivo immobilisation on the external surface of Gram-negative bacterial cells, by means of the ice nucleation protein (INP) from Pseudomonas syringae. Most recently, the N-terminal domain of INP (INPN) was used to produce a novel anchoring and self-labelling protein tag (hereinafter ASLtag). The ASLtag consists of two moieties, the INPN and the engineered SsOGT-H5 mutant (Figure 3).
Figure 3. The novel anchoring and self-labelling protein tag (ASLtag) system. A protein of interest (POI) is genetically fused to the tag, which in turn anchors it in the outer membrane and makes it accessible for the covalent linkage to a desired chemical group (magenta sphere) by the activity of SsOGT-H5 (adapted from ).
The INPN allows the in vivo immobilisation of enzymes of interest on the E. coli outer membrane and their exposure to the solvent. This significantly reduces the costs related to purification and immobilization, and the enzymes can be recovered by simple filtration or centrifugation methods. SsOGT-H5, in turn, gives the unique opportunity to label immobilized enzymes with any desired chemical group (suitably conjugated to the benzyl-guanine; in magenta in Figure 3), dramatically expanding the biotechnological applications of this new tool. Depending on the chemical group of choice, it may be possible to modulate the activity of enzymes fused with the ASLtag by introducing activator or inhibitor molecules (Figure 3). The ASLtag system was successfully employed for the expression and immobilization of monomeric biocatalysts, such as the thermostable carbonic anhydrase from Sulfurihydrogenibium yellowstonense (SspCA), as well as the tetrameric Ssβgly, without affecting their folding and catalytic activity. Moreover, SspCA fused to the ASLtag showed an increase in residual activity of up to 30 % over a period of 10 days at 70 °C, which represents a huge advantage for prolonged reactions in bioreactors and for the reuse of biocatalysts.