1. Thermophilic and Hyperthermophilic Nucleic Acid Polymerases
Generally speaking, the thermophilic and hyperthermophilic nucleic acid polymerases that have been extensively investigated and applied mainly include thermostable DNA polymerases (DNAPs) and a small number of RNA polymerases (RNAPs), mostly from Thermus, Thermococcus, and Pyrococcus [1][2][3]. From the perspective of family classification, the DNAPs among these enzymes are concentrated in families A and B (Figure 1).
Figure 1. Phylogenetic tree of representative thermophilic DNAPs. Green: family A DNAPs; pale yellow: family B DNAPs.
Taq DNAP from the thermophilic bacterium
Thermus aquaticus was the first isolated thermophilic DNAP
[4], which led to a breakthrough in PCR technology by eliminating the addition of a new enzyme after each cycle of thermocycling
[5][6]. The optimal temperature of Taq DNAP is 75 to 80 °C, much higher than that of DNAPs from organisms living in moderate-temperature environments. However, its half-lives are relatively short: 45 to 50 min at 95 °C and 9 min at 97.5 °C
[7]. Taq DNAP is classified as a family A polymerase. It has 5′-3′ exonuclease activity but no 3′-5′ exonuclease activity, so its fidelity is lower than that of polymerases that possess 3′-5′ exonuclease activity
[8]. Under optimized conditions, the error rate of Taq DNAP was measured to be about 1.2 × 10⁻⁵ to 3.3 × 10⁻⁶ mf × bp⁻¹ × d⁻¹ (mutation frequency per base pair per duplication)
[9]. Based on real-time polymerase chain reaction (PCR) experiments, the amplification efficiency of Taq DNAP was found to be around 80% for targets shorter than 1 kb and around 60% for 2.6 kb targets with a GC content between 45 and 56%
[10]. Efforts have been made to alter the properties of Taq DNAP to improve its performance in different applications. For example, the pH of the reaction buffer and the MgCl₂ concentration have been optimized to improve its fidelity
[9]. Deletion of appropriate regions in the 5′-3′ exonuclease domain has proven effective in improving the fidelity or thermostability of Taq DNAP. KlenTaq, a truncated variant of Taq DNAP lacking the N-terminal 235 amino acids, has been reported to have a two-fold higher fidelity than Taq DNAP
[11]. A similar variant, the Stoffel fragment (SF), which lacks the N-terminal 289 amino acids, was found to have increased thermostability
[7]. Besides its DNA amplification activity, Taq DNAP also shows some reverse transcription (RT) activity
[12]. With optimized conditions, Taq DNAP could facilitate one-enzyme reverse transcription-qPCR of viral RNA
[13].
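The efficiency and half-life figures quoted above can be turned into rough numbers. The sketch below assumes idealized exponential amplification, in which each cycle multiplies the copy number by (1 + efficiency), and first-order thermal inactivation described by a single half-life. Both are simplifying assumptions rather than models taken from the cited studies, and the function names and the 30 s denaturation step are hypothetical.

```python
def fold_amplification(efficiency: float, cycles: int) -> float:
    """Fold increase in template copies after `cycles` PCR cycles.

    Assumes every cycle multiplies the copy number by (1 + efficiency),
    where efficiency = 1.0 would mean perfect doubling.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return (1.0 + efficiency) ** cycles


def residual_activity(minutes_at_temp: float, half_life_min: float) -> float:
    """Fraction of enzyme activity left after heating, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (minutes_at_temp / half_life_min)


if __name__ == "__main__":
    # Efficiencies quoted in the text: ~80% (<1 kb targets) and ~60% (2.6 kb targets).
    for label, eff in (("<1 kb", 0.80), ("2.6 kb", 0.60)):
        print(f"{label}: {fold_amplification(eff, 30):.2e}-fold after 30 cycles")

    # Hypothetical protocol: 30 cycles with a 30 s denaturation step at 95 °C
    # (15 min in total), evaluated with the lower quoted half-life of 45 min.
    print(f"residual activity: {residual_activity(15, 45):.2f}")
```

Under these assumptions, a modest difference in per-cycle efficiency compounds into a large difference in final yield, which is one reason long targets are harder to amplify with Taq DNAP.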
Similar A-family thermophilic DNAPs have been identified from other
Thermus strains, such as Tfi DNAP from
Thermus filiformis [14], Tth DNAP from
Thermus thermophilus [15], Tfl DNAP from
Thermus flavus [16], Tca DNAP from
Thermus caldophilus [17], and TsK1 DNAP from
Thermus scotoductus [18]. Like Taq DNAP, these DNAPs possess 5′-3′ exonuclease activity but no 3′-5′ exonuclease activity. The PCR performances, such as amplification efficiency, fidelity, and specificity, and reaction conditions of Tfi DNAP, are similar to those of Taq DNAP
[19]. Removing the 5′-3′ exonuclease domain of Tfi DNAP did not significantly affect the enzyme activity and stability
[20]. However, a comparative study of exo⁻/exo⁺ Tfi DNAP blends with different blending ratios showed that raising the proportion of the exo⁻ Tfi mutant increased the PCR amplification yield of the target product
[21]. Tth DNAP from
Thermus thermophilus HB8 also showed structural and functional similarities with Taq DNAP
[22]. In addition, under similar conditions in the presence of Mn²⁺, Tth DNAP exhibited higher reverse transcription activity than Taq DNAP, which is useful for one-pot reverse transcription and PCR amplification of low-level RNA
[23]. Sso7d is a small protein isolated from the hyperthermophilic archaeon Saccharolobus solfataricus that may play a role in stabilizing genomic DNA in the cell
[24][25]. It is highly thermostable and binds dsDNA with little sequence preference
[26]. An early study found that fusing this protein to several DNAPs significantly enhanced their processivity
[27]. Recently, the Sso7d protein was also fused to the N-terminus of Tth DNAP, which might improve the DNA binding capacity and processivity of Tth DNAP without affecting its catalytic activity and stability
[22].
Despite their high sequence homology with Taq DNAP, some family A DNAPs from other
Thermus strains exhibit some different characteristics. For example, Tfl DNAP from
Thermus flavus demonstrated a higher thermostability and maintained PCR activity after heat treatment at 94 °C for 60 min, while Taq DNAP lost activity within 30 min under the same temperature
[28]. Furthermore, Tfl and Tth DNAPs can substantially overcome the effects of PCR inhibitors present in intraocular fluids and blood, avoiding false-negative results
[29][30]. As another example, Tca DNAP exhibited longer half-lives in the presence of gelatin and a narrower working pH range than Taq DNAP
[17][31]. Recently, a novel A-family DNAP, TsK1 DNAP, was reported to have a comparable half-life to rTaq (a commercially available recombinant Taq DNAP), which is shorter than that of Taq DNAP
[18]. However, this enzyme demonstrated high amplification efficiency and better fidelity than Taq DNAP, making it a potential tool for molecular biology methodologies.
Many other family A DNAPs have been isolated from
Bacillus species
[32], such as Bst DNAP from
Bacillus stearothermophilus [33] (now categorized as
Geobacillus stearothermophilus [34]), Bca DNAP from
Bacillus caldotenax [35], Bcav DNAP from
Bacillus caldovelox [36], Bsm DNAP from
Bacillus smithii [37], and Gss DNAP from
Geobacillus sp. 777
[38]. The optimal temperatures of these DNAPs are 60 to 70 °C, which are lower than those of the thermostable polymerases from
Thermus species introduced above. Bst-like DNAPs are widely used in isothermal amplification techniques, such as loop-mediated isothermal amplification (LAMP) and whole genome amplification (WGA), due to their strong strand displacement activity
[39][40].
Hyperthermophilic microorganisms are bacteria or archaea whose optimal temperature for growth exceeds 80 °C
[41].
Thermotoga,
Thermosipho,
Aquifex, and
Thermocrinis are common genera of hyperthermophilic bacteria
[42]. Tma DNAP isolated from
Thermotoga maritima is a 97 kDa A-family polymerase with inherent 3′-5′ proofreading activity and 5′-3′ exonuclease activity
[43]. Tma DNAP exhibited activity over a wide range of temperatures from 45 to 90 °C, with the optimal temperature being 75 to 80 °C. N-terminal truncation of Tma DNAP yielded UlTma (Perkin-Elmer) with enhanced thermostability
[44]. The presence of 3′-5′ proofreading activity does not confer a high level of fidelity on UlTma, as implied by its replication accuracy, which is similar to that of Taq DNAP
[45]. A similar polymerase, Tne DNAP, has been isolated from
Thermotoga neapolitana [46]. Later research found that mutations in the O-helix region improved the fidelity of this polymerase
[47]. A mixture of KlenTaq and Tne DNAPs has also been prepared and found useful for the efficient amplification of long DNA fragments
[48]. Aae DNAP isolated from
Aquifex aeolicus is another family A DNAP, possessing 5′-3′ polymerase activity and 3′-5′ proofreading activity but no 5′-3′ exonuclease activity
[49]. Half-lives of Aae DNAP, in the presence of BSA, were 6 h and 1.7 h at 75 and 85 °C, respectively. Although
Aquifex aeolicus can grow at nearly 95 °C, the activity of Aae DNAP decreased rapidly at temperatures over 90 °C.
Family B DNAPs from hyperthermophilic archaea have been widely used in PCR due to their good thermostability and 3′-5′ proofreading activity
[50][51]. Several thermostable DNAPs have been isolated from the genera
Thermococcus and
Pyrococcus, characterized, and commercialized. Tli (Vent) DNAP from
Thermococcus litoralis is an archaeal DNAP, with a molecular weight of 89 kDa
[52]. It is also the first reported thermostable DNAP possessing proofreading activity, which demonstrated a 2–4 times lower error rate compared to the proofreading activity-free enzyme Replinase DNAP (isolated from
Thermus flavus)
[52]. Tli DNAP is extremely thermostable, having a half-life of 2 h at 100 °C, and can be used for high-temperature DNA synthesis. In addition, Tli DNAP is resistant to hemoglobin inhibition, making it suitable for PCR amplification of DNAs in blood samples
[29]. KOD DNAP is another commercial high-fidelity B-family polymerase possessing a 3′-5′ exonuclease domain and was isolated from
Thermococcus kodakaraensis [50] (formerly
Pyrococcus sp. KOD1
[53]). KOD DNAP has a higher thermostability than most DNAPs, and its half-life at 95 °C reaches 12 h. It has also been reported to have an extension rate of 6.0–7.8 kb/min and an error rate of 2.6 × 10⁻⁶, allowing efficient and faithful amplification of DNA in PCR. A PCR technique based on KOD DNAP was further developed for the accurate amplification of long DNAs
[54]. With a mixture of wild-type KOD DNAP and its exo⁻ variant (N210D), i.e., KOD Dash polymerase, long DNA fragments (up to 15 kb) were accurately amplified. 9°N DNAP, isolated from
Thermococcus sp. 9°N-7, has temperature-sensitive strand displacement activity and Km values similar to those of Tli DNAP
[55]. Tgo DNAP, isolated from
Thermococcus gorgonarius, is another polymerase that has been widely engineered for xeno nucleic acid (XNA) synthesis
[56]. Besides these DNAPs introduced above, many other B-family polymerases have also been isolated from
Thermococcus species and characterized, including Tfu DNAP from
Thermococcus fumicolans [57], TNA1 DNAP from
Thermococcus sp. NA1
[58], Tpe DNAP from
Thermococcus peptonophilus [59], Tzi DNAP from
Thermococcus zilligii [60], and Twa DNAP from
Thermococcus waiotapuensis [61].
Pfu DNAP is one of the most representative family B DNAPs isolated from hyperthermophilic marine archaea
Pyrococcus furiosus [51]. Pfu DNAP has a high fidelity under optimized buffer and substrate concentrations
[62]. The error rate of Pfu DNAP was found to be 1.38 × 10⁻⁶. This high fidelity is mainly attributed to the 3′-5′ exonuclease activity, and an exo⁻ mutant of Pfu DNAP demonstrated significantly decreased fidelity
[62]. The extension rate of Pfu DNAP is only 0.5–1.5 kb/min, which is lower than that of most other DNAPs
[2]. The fusion of the Sso7d protein to the N-terminus of Pfu DNAP improved its processivity but did not affect its catalytic activity and stability
[27]. Pfu DNAP was also found to incorporate dUTP only weakly, which reduced PCR efficiency when dUTP was used in the reaction
[63].
Pst DNAP, also known as Deep Vent DNAP, is another well-studied thermophilic archaeal DNAP. It was isolated from
Pyrococcus strain GB-D, which can grow at 104 °C
[64]. Pst DNAP possesses a high fidelity with an error rate of 2.7 × 10⁻⁶, better than that of Vent or Taq DNAP
[62]. Like other B-family polymerases, Pst DNAP has a high 3′-5′ proofreading activity that decreases errors during the DNA replication process. Deletion of the 3′-5′ exonuclease activity also significantly reduced its fidelity
[65].
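To compare the fidelity figures quoted in this section on a common footing, the following sketch estimates the fraction of PCR products that carry at least one error. It rests on several assumptions that are not part of the cited work: errors accumulate independently (Poisson statistics), every error rate is expressed per base pair per duplication (the text states these units only for Taq DNAP), every product molecule has been copied the full number of duplications (an upper-bound simplification), and the 1 kb amplicon with 20 duplications is an arbitrary example.

```python
import math

# Error rates quoted in the text; units assumed to be errors per bp per duplication.
ERROR_RATES = {
    "Taq (upper)": 1.2e-5,
    "Taq (lower)": 3.3e-6,
    "Pst (Deep Vent)": 2.7e-6,
    "KOD": 2.6e-6,
    "Pfu": 1.38e-6,
}


def fraction_with_errors(error_rate: float, length_bp: int, duplications: int) -> float:
    """Expected fraction of product molecules containing at least one error.

    Mean errors per molecule = error_rate * length_bp * duplications;
    the Poisson probability of zero errors is exp(-mean).
    """
    mean_errors = error_rate * length_bp * duplications
    return 1.0 - math.exp(-mean_errors)


if __name__ == "__main__":
    for name, rate in ERROR_RATES.items():
        frac = fraction_with_errors(rate, length_bp=1000, duplications=20)
        print(f"{name:16s} {frac:6.1%} of 1 kb products carry an error")
```

Because the mean number of errors scales with both amplicon length and the number of duplications, the practical gap between a proofreading and a non-proofreading enzyme widens for long targets.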
Several other thermophilic DNAPs that are less well known than Pfu and Deep Vent DNAPs have been isolated from other
Pyrococcus species. For example, Pab DNAP, which is also a B-family DNAP, was isolated from
Pyrococcus abyssi, an archaeon growing in hyperthermal environments in the deep sea
[66]. Pab DNAP has a higher thermostability than Taq and Pfu DNAPs and retains 75% of its activity after being incubated at 100 °C for 5 h
[67]. Pab DNAP also has 3′-5′ exonuclease activity that confers proofreading ability and high fidelity to it
[67]. Another example is Pwo DNAP from
Pyrococcus woesei [68]. This DNAP has a molecular weight of 90 kDa, and also possesses 3′-5′ exonuclease proofreading function like other B-family DNAPs
[69]. For the highest activity, Pwo DNAP needs a slightly more alkaline buffer condition, which may lead to the degradation of dNTPs, and thus dNTPs should be added right before the addition of Pwo DNAP when preparing the PCR solution
[70]. In addition, the 3′-5′ exonuclease activity of this polymerase can degrade primers and PCR products when dNTP concentrations are low; nuclease-resistant phosphorothioate-protected primers can be used to solve this problem
[2].
The phylogenetic tree of these polymerases shown in Figure 1 exhibits their evolutionary relationships. Although some of these DNAPs are neither well known nor commercialized, their identifications have expanded the repertoire of thermophilic DNAPs, providing more candidates to be explored and engineered for various potential applications.
Although natural thermophilic DNAPs are very efficient at DNA synthesis and have therefore been broadly used in biotechnology, their XNA synthesis activities are usually relatively poor, which severely limits their applications in xenobiology. To obtain efficient polymerases for the synthesis, reverse transcription, and even amplification or inter-transcription of XNAs (Figure 2), natural polymerases have to be engineered using various protein engineering strategies.
Figure 2. Expansion of the central dogma with XNAs and XNAPs. Green arrows: replication; blue arrows: transcription or reverse transcription.
2. Strategies for Engineering Thermophilic Nucleic Acid Polymerases
The engineering of polymerases can be carried out via directed evolution, rational design, or semi-rational design
[1][71][72][73]. Directed evolution mimics Darwinian evolution in nature, but with a greatly shortened timescale for obtaining desired phenotypic traits
[74]. Random mutagenesis and/or recombination are carried out on the target polymerase genes with a much higher frequency than that of spontaneous mutagenesis or recombination in nature, followed by the selection or screening of desired mutants under artificial pressures. For polymerases with more structural information, rational or semi-rational approaches can be used to predict candidate residues or regions for mutagenesis, reducing the size of the polymerase library and the labor intensity for subsequent selection or screening
[75].
2.1. Strategies for Mutant Generation or Library Construction
Error-prone PCR and DNA shuffling are the two methods most extensively employed to randomize the gene of a target protein (Figure 3). Error-prone PCR is derived from the standard PCR reaction, but uses low-fidelity polymerases and altered reaction conditions, including unbalanced dNTP concentrations and the addition of manganese ions, to increase the mutation rate of the target gene during amplification
[76]. DNA shuffling provides a method to recombine homologous gene sequences, which is similar to natural homologous recombination but much more efficient
[77]. Many strategies have been developed for DNA shuffling, including DNase I fragmentation and reassembly, staggered extension process (StEP), and synthetic shuffling. Traditional DNA shuffling involves DNase I digestion of a pool of homologous genes and subsequent reassembly of fragments by PCR
[78]. Instead of random fragmentation and assembly, StEP uses the target genes as templates to create a recombinant library through multiple rounds of shortened polymerase-catalyzed extension
[79]. In some cases, the addition of specific synthetic oligonucleotides during DNA shuffling can make the libraries more directional for studying the function of interest
[80]. DNA shuffling usually requires high sequence homology. For parental genes with insufficient homology, the template sequences can be computationally optimized to improve their homology before shuffling
[81][82].
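The effect of error-prone PCR on a gene can be illustrated in silico. The following toy sketch assumes that every base is substituted independently and with equal probability for each alternative base; real error-prone PCR shows mutational bias and also produces insertions and deletions, none of which is modeled here. The parent sequence, the mutation rate, and all function names are hypothetical.

```python
import random

BASES = "ACGT"


def mutate(sequence: str, per_base_rate: float, rng: random.Random) -> str:
    """Return a copy of `sequence` with random base substitutions.

    Each position is substituted independently with probability
    `per_base_rate`; a substitution always changes the base.
    """
    out = []
    for base in sequence:
        if rng.random() < per_base_rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def make_library(parent: str, per_base_rate: float, size: int, seed: int = 0) -> list[str]:
    """Generate `size` mutant sequences from one parent gene."""
    rng = random.Random(seed)
    return [mutate(parent, per_base_rate, rng) for _ in range(size)]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    parent = "ATGGCTAGCAAGGAGGAACTGTTCACCGGT" * 10  # hypothetical 300 nt gene fragment
    library = make_library(parent, per_base_rate=0.01, size=1000)
    mean_mutations = sum(hamming(parent, m) for m in library) / len(library)
    print(f"mean substitutions per variant: {mean_mutations:.2f}")
```

Tuning `per_base_rate` mirrors the experimental trade-off: a rate that is too low yields mostly unchanged genes, whereas one that is too high inactivates most variants.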
Figure 3. Representative methods employed for the generation of XNAPs. (a) Methods for the construction of polymerase libraries. I: error-prone PCR; II: site-directed saturation mutagenesis; III: gene shuffling by StEP; IV: gene shuffling by DNase I digestion and PCR reassembly; (b) methods for the selection of polymerase mutants. I: phage display; II: CSR; III: CST.
Rapid developments in sequencing, structure determination, and computational tools pave the way to the rational design of proteins. Through collection and analysis of existing sequence/structure-function data, candidate mutations of a protein for desired properties can be predicted. Site-directed mutagenesis is then carried out to generate target mutants or focused libraries. With a deeper understanding of sequence–structure-function relationships, even de novo design of proteins can be accomplished
[83][84]. In past decades, many algorithms have been developed to facilitate structure prediction, design, and engineering of proteins, such as FoldX
[85], Rosetta
[86], I-Mutant
[87], FRESCO
[88], PROSS
[89], and UniRep
[90]. As an example of applying rational protein engineering approaches on polymerases, four Bst DNAP variants with enhanced thermostability have recently been obtained through MutCompute, an unsupervised machine learning algorithm
[91].
Although rational design dramatically reduces the time and labor involved in the selection or screening of protein mutants, it requires extensive and in-depth sequence/structure-function data to be accurate, and such data are not available for many proteins
[92]. Combining the advantages of both directed evolution and rational design, semi-rational design has proven to be an effective tool for protein engineering. A small number of promising residues are identified based on computational simulation and analysis, leading to the construction of smaller but high-quality libraries and more efficient evolution processes. Various semi-rational approaches for protein engineering have been developed, including structure-based combinatorial protein engineering (SCOPE)
[93], combinatorial active-site saturation test (CAST)
[94], iterative saturation mutagenesis (ISM)
[95], sequence saturation mutagenesis (SeSaM)
[96], protein sequence activity relationship algorithm (ProSAR)
[97], and reconstructing evolutionary adaptive paths (REAP)
[98]. Some of these approaches have been successfully used for the engineering of many thermophilic DNAPs, such as Bst DNAP and Taq DNAP
[91][99].
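The benefit of restricting randomization to a few residues can be quantified with a short calculation. The sketch below assumes NNK degenerate codons (32 codons per position), which the text does not specify, and equal representation of all variants in the library. The oversampling estimate follows from coverage = 1 − exp(−T/V), where T is the number of clones screened and V is the library size.

```python
import math


def nnk_library_size(positions: int) -> int:
    """Number of distinct DNA sequences when `positions` codons are randomized with NNK.

    NNK (N = A/C/G/T, K = G/T) gives 4 * 4 * 2 = 32 codons per position.
    """
    if positions < 0:
        raise ValueError("positions must be non-negative")
    return 32 ** positions


def clones_for_coverage(library_size: int, coverage: float = 0.95) -> int:
    """Clones to screen so that the expected fraction of variants sampled is `coverage`.

    Assumes all variants are equally likely: coverage = 1 - exp(-T / V),
    hence T = -V * ln(1 - coverage).
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    return math.ceil(-library_size * math.log(1.0 - coverage))


if __name__ == "__main__":
    for n in range(1, 6):
        size = nnk_library_size(n)
        print(f"{n} position(s): {size} sequences, "
              f"{clones_for_coverage(size)} clones for 95% coverage")
```

The screening effort grows by a factor of 32 with each additional randomized position, which is why semi-rational methods that narrow the choice to a handful of residues keep libraries within reach of plate-based screening.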
2.2. Strategies for the Selection or Screening of Polymerase Libraries
Directed evolution is a powerful tool in the development of polymerases, in which the critical step is to build a high-throughput selection or screening method for the enrichment of active mutants. Selection or screening strategies for protein mutants are usually designed based on the binding of the proteins and their ligands, visualization of the catalytic activities of the enzymes, selective amplification of the target genes, or viability of the organisms
[100]. The core for the selection or screening of a library is to connect the phenotypes of the mutants with their genotypes. At present, broadly used methods for selecting or screening polymerase mutants with unnatural activities mainly include methods based on the phage system, in vitro compartmentalization system, and multi-well plate system
[71].
Based on the phage display technique, Romesberg and co-workers developed a polymerase selection system
[99]. In this system, the polymerase library was co-displayed with the primer/template substrate on M13 phage particles, and successful extension of the primer with unnatural nucleoside triphosphates led to biotin labeling of the 3′-end of the primer, allowing the separation of active polymerase mutants from the library with streptavidin-coated beads. In another example, Liu and co-workers designed a phage-assisted continuous evolution (PACE) system, which can run iterative rounds of protein evolution continuously without human intervention
[101]. This system correlated the desired activity of the target protein with the infectivity of the M13 phage and, thus, realized the rapid evolution of the protein along with the phage propagation.
In vitro compartmentalization is another strategy to build a linkage between genotypes and their corresponding phenotypes and has been broadly used in protein evolution. Some two decades ago, Tawfik and Griffiths developed “man-made cell-like compartments” using water-in-oil emulsions to generate separated micro-reactors, allowing the isolation of independent reactions and selection of promising protein variants
[102]. Since then, emulsion-based compartmentalization has also been extensively applied to build polymerase evolution systems. Holliger and co-workers designed the compartmentalized self-replication (CSR) method based on microemulsion, in which polymerase variants were individually packaged into compartments of water-in-oil emulsion, together with PCR primers and nucleoside triphosphate substrates
[103]. In this way, the genes of active variants were replicated by the polymerases that they encoded during thermocycling and enriched in the gene pool. Later, various derivative methods of CSR have been developed, including short-patch compartmentalized self-replication (spCSR)
[104], reverse transcription-compartmentalized self-replication (RT-CSR)
[105][106], compartmentalized partnered replication (CPR)
[107], and high-temperature isothermal compartmentalized self-replication (HTI-CSR)
[82].
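The enrichment principle behind CSR can be illustrated with a toy model. It assumes that, in each round, a variant's share of the gene pool is multiplied by its relative self-replication yield and the pool is then renormalized. The starting frequency and the 100-fold yield advantage are hypothetical numbers chosen for illustration and are not taken from the cited studies.

```python
def enrich(frequencies: list[float], fitness: list[float], rounds: int) -> list[float]:
    """Iterate frequency-proportional selection.

    In each round a variant's share of the gene pool is multiplied by its
    relative self-replication yield (`fitness`) and the pool is renormalized.
    """
    if len(frequencies) != len(fitness):
        raise ValueError("frequencies and fitness must have the same length")
    freqs = list(frequencies)
    for _ in range(rounds):
        weighted = [f * w for f, w in zip(freqs, fitness)]
        total = sum(weighted)
        if total == 0:
            raise ValueError("no variant replicates; the pool is empty")
        freqs = [x / total for x in weighted]
    return freqs


if __name__ == "__main__":
    # Hypothetical pool: one active variant per million inactive ones,
    # with the active variant replicating its gene 100 times better.
    start = [1e-6, 1.0 - 1e-6]
    yields = [100.0, 1.0]
    for r in range(0, 6):
        active, _ = enrich(start, yields, r)
        print(f"round {r}: active fraction = {active:.6f}")
```

In this idealized picture, a rare active variant dominates the pool after only a few rounds, which is the rationale for iterating selection rather than relying on a single pass.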
To directly select or screen for hard-to-evolve polymerase mutants with XNA synthesis or reverse transcription activities, several other in vitro compartmentalization-based selection or screening strategies that do not rely on self-replication of the polymerase gene have been developed. For example, the compartmentalized self-tagging (CST) method was designed to select polymerases capable of XNA synthesis
[108]. Similar to CSR, the polymerase pool was also compartmentalized with primers and nucleoside triphosphate substrates in water-in-oil emulsions to ensure the separation of individual variants and genotype–phenotype association. Different from CSR, CST was based on the tagging of a polymerase-encoding plasmid by extension of a short biotinylated primer when the polymerase had desired activity. Subsequent bead separation of the tagged plasmid allowed the enrichment of active polymerase variants. Later, to select for XNA RTs, Holliger and co-workers developed compartmentalized bead labeling (CBL), which relied on bead co-immobilization of the polymerase-encoding plasmids and primer/template complex for reverse transcription and subsequent fluorescent screening of the beads harboring desired variants
[109].
To achieve a more controllable in vitro compartmentalization, microfluidic systems can be used to generate predefined compartments
[110]. Recently, the Chaput group developed the droplet-based optical polymerase sorting (DrOPS) method, which relies on microfluidic technology and cell sorting and in which polymerase variants are encapsulated with optical sensors that monitor polymerase activity
[111]. Successful extension of the primer by polymerase mutants led to the generation of fluorescence, and then, the water-in-oil-in-water or water-in-oil droplets were sorted by fluorescence-activated cell sorting (FACS) or fluorescence-activated droplet sorting (FADS)
[112][113].
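Compartment-based methods preserve the genotype-phenotype link only when each compartment holds a single variant. A common way to reason about this, assumed here rather than taken from the cited studies, is to treat the number of encapsulated units per droplet (for example, cells each expressing one variant) as Poisson distributed. The sketch below computes droplet occupancy for a few hypothetical mean loadings.

```python
import math


def occupancy_probability(mean_per_droplet: float, k: int) -> float:
    """Poisson probability that a droplet holds exactly k encapsulated units."""
    if mean_per_droplet < 0 or k < 0:
        raise ValueError("mean_per_droplet and k must be non-negative")
    return math.exp(-mean_per_droplet) * mean_per_droplet ** k / math.factorial(k)


def clonal_fraction(mean_per_droplet: float) -> float:
    """Among occupied droplets, the fraction that contain exactly one unit."""
    occupied = 1.0 - occupancy_probability(mean_per_droplet, 0)
    if occupied == 0.0:
        raise ValueError("no droplets are occupied at a mean loading of zero")
    return occupancy_probability(mean_per_droplet, 1) / occupied


if __name__ == "__main__":
    for lam in (0.05, 0.1, 0.3, 1.0):  # hypothetical mean loadings
        empty = occupancy_probability(lam, 0)
        print(f"mean {lam}: {empty:.1%} empty, "
              f"{clonal_fraction(lam):.1%} of occupied droplets are clonal")
```

Under this assumption, dilute loading keeps most occupied droplets clonal at the cost of many empty droplets, so throughput and purity have to be balanced.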
Although multi-well plate screening methods do not have throughputs as high as those of methods introduced above, they are still broadly used in the identification of polymerase variants from focused libraries with smaller size or from libraries pre-enriched with the methods introduced above
[99][104][108][109]. In a typical multi-well plate screening method for polymerase mutants, the polymerase-mediated primer extension is correlated with the generation of colored or fluorescent products from enzymatic reactions, which can be directly monitored with a plate reader.
3. Thermophilic XNAPs
Thermophilic nucleic acid polymerases and their mutants have been extensively explored and used in the synthesis, reverse transcription, and even amplification of XNAs
[1][114]. Although some other polymerases that are not high temperature tolerant, such as mutants of T7 RNAP, have also been used for the synthesis of modified nucleic acids
[115], thermophilic nucleic acid polymerases are indispensable for XNA synthesis, reverse transcription, or amplification at high temperatures or with thermocycling programs. Such conditions are essential when difficult templates with complex secondary structures are used, and are important for achieving higher yields.
Some thermophilic DNAPs, such as Taq, KlenTaq, Tth, KOD (exo⁻), KOD Dash, Vent (exo⁻), Pwo, Pfu, and Tgo DNAPs, demonstrate good tolerance to modifications on nucleobases
[116][117][118][119][120][121] but are less tolerant of sugar modifications. Nevertheless, the synthesis or reverse transcription of various sugar-modified XNAs of limited length by certain polymerases has been reported, albeit with low efficiency. For example, Taq DNAP was reported to be capable of reverse transcribing or replicating hexitol nucleic acid (HNA) over a length of a few nucleotides
[122]. Bst DNAP has proven capable of reverse transcribing 2′-fluoro-arabino nucleic acid (FANA), α-L-threofuranosyl nucleic acid (TNA), and glycerol nucleic acid (GNA)
[123][124]. Transcription or replication of short stretches of cyclohexenyl nucleic acid (CeNA) was demonstrated with Vent (exo⁻) DNAP, and reverse transcription of short CeNA was realized with Taq DNAP or Vent (exo⁻) DNAP
[125]. Deep Vent (exo⁻) is able to reverse transcribe short stretches of TNA templates into DNA, and effectively incorporate all four 2′-deoxy-2′-fluoro-β-D-arabinonucleoside 5′-triphosphates (2′-F-araNTPs) on a DNA template to yield full-length FANA products
[126][127]. In general, most natural polymerases show relatively narrow substrate specificities and limited activities towards XNAs. This is likely due to the fact that, in nature, to play their respective roles, polymerases have to possess stringent substrate specificities to accurately discriminate the sugars (deoxyribose and ribose) in their substrates, so that they can use the correct template (DNA or RNA) and the correct nucleoside triphosphates (dNTPs or NTPs) for the synthesis of their target products
[128]. The introduction of unnatural sugars into the nucleic acids also usually leads to a great change in their structures, which may contribute to their difficult recognition by the natural polymerases as well
[129]. To overcome the stringent substrate specificity and increase the activity towards unnatural substrates, natural polymerases have to be engineered.
Some commercial polymerase mutants demonstrate enhanced synthesis efficiency for modified nucleic acids. For example, Therminator DNAP, a variant of 9°N (exo⁻) DNAP, can incorporate various modified nucleotides
[130][131][132] and even efficiently and faithfully synthesize long TNA oligonucleotides from DNA templates
[132]. However, to realize efficient synthesis or reverse transcription of most of the fully substituted XNAs, the polymerases have to be further engineered via the directed evolution, rational design, or semi-rational design approaches summarized above.
Among thermophilic family A DNAPs, Taq DNAP and its truncated mutants, including SF and KlenTaq, are the most explored and engineered ones for expanded substrate repertoires. With a phage-display-based polymerase selection system, Romesberg and co-workers evolved an SF mutant, SFM19, that can incorporate 2′-O-methyl ribonucleoside triphosphates (2′-OMe-NTPs) on a DNA template
[133]. Later, they further optimized the selection method and used SFM19 as the evolutionary starting point to evolve a series of polymerases, including SFM4-3, SFM4-6, and SFM4-9, that could transcribe or reverse transcribe fully 2′-OMe-modified oligonucleotides, or even PCR amplify partially 2′-OMe- or 2′-F-modified DNAs
[99]. Further investigation demonstrated that these mutants could also synthesize or amplify other sugar-modified nucleic acids, including 2′-chloro (2′-Cl), 2′-amino (2′-Am), 2′-azido (2′-Az), and arabino-modified DNAs, and 2′-OMe- and 2′-F-modified RNAs
[134][135][136]. Holliger and co-workers developed spCSR to select for variants of Taq DNAP, and obtained a mutant, AA40, with the ability to incorporate NTPs and sugar-modified nucleoside triphosphates
[104].
Several thermophilic family B DNAPs have also been extensively engineered for the efficient synthesis and reverse transcription of various XNAs. For example, Holliger and co-workers used the CST method to select the libraries of TgoT DNAP (a Tgo DNAP mutant containing mutations V93Q, D141A, E143A, and A485L), and a mutant with HNA polymerase activity, Pol6G12, was obtained
[108]. They also combined statistical correlation analysis (SCA) with activity screening or CST to develop a series of polymerases capable of synthesizing or reverse transcribing other XNAs, among which PolC7 can efficiently synthesize CeNA and LNA, PolD4K can efficiently synthesize ANA and FANA, RT521 can efficiently synthesize TNA and reverse-transcribe TNA, ANA, and FANA into DNA, while RT521K has good reverse transcription activity for CeNA and LNA. Later, by randomizing the positively charged and bulky residues of mutant RT521, which might lead to a steric clash between the polymerase surface and the P-ethyl-modification on the phosphate backbone, screening the libraries, and performing further site-directed mutagenesis, they successfully obtained mutant PGV2, which demonstrated substantially improved synthesis activity for a newly developed XNA with an uncharged backbone, alkyl phosphonate nucleic acids (phNA)
[137]. Using CBL selection and plate-based screening methods, they further evolved a series of XNA RTs from mutant RT521K
[109]. Among them, RT-TKK can efficiently reverse-transcribe D-altritol nucleic acid (AtNA). RT-C8, which was then evolved from RT-TKK, can efficiently reverse-transcribe 2′-OMe-RNA, and also has some extent of reverse transcription activity for P-α-S-phosphorothioate 2′-methoxyethyl RNA (PS 2′-MOE-RNA). Another derivative of RT-TKK, RT-H4, can reverse transcribe HNA much more efficiently than RT521K and RT-TKK.
Chaput and co-workers identified specificity-determining residues (SDRs) of the polymerase by analyzing the polymerase/DNA complex structure and screened for beneficial mutations at SDR positions in a model polymerase scaffold
[138]. By transferring these mutations to homologous proteins, a series of mutants that demonstrated RNA and TNA synthesis activities were rapidly developed from several family B DNAPs, including 9°N, Tgo, KOD, and Deep Vent DNAPs. They also successfully selected a manganese-independent TNA polymerase, 9n-YRI, from a site-saturation mutagenesis library of 9°N DNAP with the DrOPS method that they developed
[111]. They further combined FADS with deep mutational scanning to provide an unbiased screening of all possible single-point mutations in the finger subdomain of KOD (exo⁻) DNAP
[113]. By screening mutants containing combinations of selected mutations, a double mutant, Kod-RS, which can conduct efficient TNA synthesis, was obtained, suggesting that polymerase specificity may be controlled by a small number of highly specific residues and that more attention should be paid to these sites when engineering polymerases for the synthesis of specific nucleic acids. They later developed a programmed allelic mutation (PAM) strategy, applied it with DrOPS sorting, and successfully evolved a mutant with enhanced efficiency and specificity for TNA synthesis, Kod-RSGA, from mutant Kod-RS
[139]. Herdewijn and co-workers reported 3′-2′ phosphonomethylthreosyl nucleic acid (tPhoNA or PMT) as a novel genetic material, and carried out stepwise engineering of TgoT DNAP to produce a PMT polymerase
[140]. By introducing mutations that are related to XNA synthesis activity and screening for mutations at key residues based on previously reported mutants, they successfully obtained mutant TgoT-EPFLH, which can efficiently synthesize PMT. They also demonstrated that PMT could be efficiently reverse transcribed into DNA by both TgoT mutant RT521 and KOD mutant K.RT521K. Based on structural analysis, Hoshino et al. developed variants of KOD DNAP for LNA synthesis and reverse transcription, among which KOD-DGLNK can efficiently synthesize LNA from DNA, and KOD-DLK can efficiently reverse transcribe LNA into DNA
[141]. KOD-DGLNK and KOD-DLK also demonstrated transcription and reverse transcription activity for 2′-OMe-RNA, respectively. Recently, Chaput and co-workers systematically compared some of the representative XNAPs obtained in the previous works introduced above and demonstrated their diversity in thermostability and activity, specificity, and fidelity for the synthesis or reverse transcription of different nucleic acids, including RNA, FANA, ANA, HNA, TNA, and PMT
[142].
This entry is adapted from the peer-reviewed paper 10.3390/ijms232314969