1. Introduction
Proteins are essential to life processes, playing important roles in transport, catalysis, and regulation. Most proteins are composed of only 20 canonical amino acids (cAAs). While the encoding of cAAs is sufficient for fundamental growth and metabolic functions, proteins require additional chemical groups, such as ketone, aldehyde, azide, amide, nitro, and sulfonate, to perform more complicated and diverse biological functions
[1][2][3]. These extra functional groups are needed because the 20 cAAs alone limit the types and functional applications of proteins and are no longer sufficient to meet the research needs of fields such as biological science, chemistry, and medicine
[4][5].
Non-canonical amino acids (ncAAs), also known as non-standard amino acids (nsAAs), unnatural amino acids (unAAs or uAAs), or non-proteinogenic amino acids (npAAs), are derivatives of cAAs. Examples include L-homoserine, which serves as a precursor for the synthesis of essential amino acids and other chemical products [6][7], and 5-aminolevulinic acid, an important precursor for the biosynthesis of heme, porphyrin, chlorophyll, and vitamin B12 [8]. The functional groups carried by ncAAs, such as keto, alkynyl, azide, nitro, phosphate, and sulfonate, enable them to modify proteins. So far, more than 200 ncAAs can be inserted into proteins using genetic code expansion (GCE) technologies [9][10], which is useful for research on protein structure and function.
Despite the growing diversity and range of applications of ncAAs, their synthesis is still a challenge
[11][12][13]. The traditional chemical synthesis methods are limited by harsh reaction conditions
[14], highly volatile toxic substances
[15][16], environmental pollution and high raw material costs
[17]. Metabolic engineering is an emerging technique that could potentially solve these issues and enable the green and efficient production of ncAAs.
2. Biosynthesis of Non-Canonical Amino Acids
As previously mentioned, the chemical synthesis of ncAAs has significant drawbacks. In addition, ncAAs have high technical barriers to synthesis due to the limitations of key technologies such as the screening and preparation of chemical catalysts, the construction of process routes, and the regulation of catalytic processes
[18]. Developing a green and efficient synthesis method for ncAAs is therefore crucial. Metabolic engineering offers a promising solution: once the catalytic mechanisms of the relevant enzymes are understood, key heterologous enzymes can be recombined, modified, and optimized in engineered microbes. These microbes make it possible to establish a sustainable and cost-effective production platform for ncAAs.
2.1. 5-Hydroxytryptophan
5-Hydroxytryptophan (5-HTP) is a compound with medicinal value that can be used to treat depression, insomnia, and other diseases. Wang et al. constructed a recombinant strain to biosynthesize 5-HTP [19]. The biosynthesis pathway was built on two plasmids containing three functional modules, namely an L-Trp (substrate) biosynthesis module, a hydroxylation module, and a cofactor regeneration module (Figure 1). Human tryptophan hydroxylase I (TPH I) was introduced into the E. coli BL21ΔtnaA strain to hydroxylate L-Trp to 5-HTP. By further inserting the tryptophan synthesis pathway into the genome, the titer of 5-HTP in shake-flask fermentation was increased to 1.61 g/L while the accumulation of the precursor L-Trp was reduced, which benefits the subsequent separation and purification of 5-HTP
[20]. Lin et al. engineered the phenylalanine-4-hydroxylase from Xanthomonas campestris (XcP4H) and introduced it into an L-Trp-producing E. coli strain along with a cofactor regeneration pathway. The engineered strain produced 1.2 g/L 5-HTP
[21]. Mora-Villalobos used sequence analysis, phylogenetic analysis, and functional differential analysis tools to predict, screen, and design specific mutations at the substrate-specificity sites of the aromatic amino acid hydroxylase from Cupriavidus taiwanensis (CtAAAH). The substrate preference of CtAAAH was shifted from L-Phe to L-Trp, enabling the generation of 5-HTP with L-Trp as the substrate [22][23]. These studies indicate that microbial metabolic engineering can produce 5-HTP in a green, efficient, and low-cost manner.
Figure 1. Biosynthetic pathways of some ncAAs derived from glucose. The blue color indicates the cAAs. The pink color indicates the ncAAs. Some key enzymes in the pathways: (1) Tryptophan hydroxylase or aromatic amino acid hydroxylase; (2) PapA, PapB and PapC; (3) Branched-chain amino acid transaminase; (4) Tyrosine phenol-lyase; (5) LeuABCD; (6) IlvCD; (7) Aspartate aminotransferase; (8) Aspartokinase; (9) Aspartate-semialdehyde dehydrogenase; (10) Homoserine dehydrogenase; (11) Radical SAM enzyme PylB; (12) ATP-dependent PylC; (13) PylD for oxidation; (14) Glutamate dehydrogenase; (15) Theanine synthetase; (16) Glutamate decarboxylase; (17) Proline-4-hydroxylase (P4H). Abbreviations: 5-HTP: 5-Hydroxytryptophan; p-AF: p-amino-phenylalanine; L-DOPA: Levodopa (3,4-dihydroxy-L-phenylalanine); L-Nva: L-Norvaline; L-Nle: L-Norleucine; L-Pyl: L-Pyrrolysine; L-Hse: L-Homoserine; GABA: Gamma-aminobutyric acid; t4Hyp: Trans-4-hydroxyproline.
2.2. L-Homoserine
L-Homoserine (L-Hse), also known as 2-amino-4-hydroxybutyric acid, is a valuable platform chemical that has been widely used in fields such as medicine, agriculture, cosmetics, and spices. Microbial fermentation has great potential for the large-scale production of L-Hse. Recent studies have focused on using E. coli and Corynebacterium glutamicum (C. glutamicum) to achieve high-level production of L-Hse [24][25][26][27]. For example, Cai et al. enhanced the production of L-Hse using a non-auxotrophic, plasmid-free E. coli chassis [28]. They first constructed an E. coli chassis host strain by knocking down the L-Hse degradation pathway
[29]. Then, they optimized the metabolic flux of L-Hse biosynthesis by overexpressing ppc, aspC, aspA, thrAfbr, and lysCfbrcgl (Figure 1). Additionally, they promoted L-Hse efflux by modifying the transport system, introduced a strategy for the synergistic utilization of cofactors to promote NADPH regeneration, and balanced the level of redox cofactors by incorporating a heterologous dehydrogenase. As a result, the engineered
E. coli strain produced 85.29 g/L of L-Hse in a 5-liter fermenter, the highest titer reported to date for a plasmid-free, non-auxotrophic strain. Although E. coli as an amino acid-producing chassis has achieved high-level production of L-Hse, large amounts of by-products such as acetate are also produced during fermentation [24][30]. C. glutamicum, which is known for its ability to synthesize useful compounds from cheap feedstocks, has also been used to produce L-Hse. By overexpressing key kinase genes, disrupting competing and degradation pathways, and enhancing the synthetic flux, the engineered C. glutamicum produced 8.8 g/L L-Hse in a shake flask
[26].
2.3. Trans-4-Hydroxyproline
Trans-4-hydroxyproline (t4Hyp) is a value-added amino acid that has been widely used in medicine, food, and cosmetics, especially in the field of chiral synthetic materials. t4Hyp is traditionally produced via the acidic hydrolysis of collagen, but this route suffers from drawbacks such as low productivity and complicated processing. Metabolic engineering has been used to construct efficient microbial cell factories in E. coli or C. glutamicum for the biosynthesis of t4Hyp
[31][32][33]. The introduction of a heterologous proline 4-hydroxylase from Alteromonas mediterranea (AlP4H) into E. coli enabled the accumulation of 45.83 g/L t4Hyp within 36 h in a 5-liter fermenter without the addition of proline
[34]. Knocking out the genes putA, proP, putP, and aceA in competing pathways and mutating ProB to the D107N/E143A variant to relieve feedback inhibition by L-Pro maximized L-Pro production and thereby enhanced the biosynthesis of t4Hyp. Subsequently, the activity of L-Pro hydroxylase was increased using genome mining and rational design. Ultimately, the engineered strain produced 54.8 g/L t4Hyp in 60 h using glycerol and glucose as carbon sources
[33]. The microbial metabolic network is large and complex. Genome modification methods such as gene knockout may lead to slow cell growth, growth stagnation, or even cell death, and may therefore be unsuitable for blocking some competing pathways. CRISPR interference (CRISPRi), which can repress the transcription of target genes by up to 1000-fold without detectable off-target effects [35][36], has emerged as an alternative way to downregulate enzyme expression and may be employed to repress the putA gene to further increase t4Hyp production in the future.
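The two fermentation results quoted in this subsection differ in both titer and run time, so one simple way to compare them is the average volumetric productivity, that is, titer divided by fermentation time. The sketch below is only illustrative arithmetic on the figures given above. It assumes each titer was reached exactly at the stated time (the first result is reported as "within 36 h", so its value is a lower bound), and the function name `volumetric_productivity` and the dictionary labels are introduced here for illustration.

```python
def volumetric_productivity(titer_g_per_l: float, hours: float) -> float:
    """Average volumetric productivity in g/L/h (titer divided by run time)."""
    if hours <= 0:
        raise ValueError("fermentation time must be positive")
    return titer_g_per_l / hours


# Titers (g/L) and run times (h) as quoted in the text; labels are illustrative.
runs = {
    "AlP4H strain, 5-L fermenter [34]": (45.83, 36.0),
    "ProB D107N/E143A strain [33]": (54.8, 60.0),
}

productivities = {
    name: volumetric_productivity(titer, hours)
    for name, (titer, hours) in runs.items()
}

for name, value in productivities.items():
    print(f"{name}: {value:.2f} g/L/h")
```

On these numbers the first strain has the higher average productivity (about 1.27 versus 0.91 g/L/h), while the second reaches the higher final titer.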
2.4. Other Non-Canonical Amino Acids
L-Pyrrolysine (L-Pyl) is the 22nd amino acid discovered so far to be inserted into proteins [37]. Krzycki et al. reported that L-Lys is the only precursor of L-Pyl. By providing isotopically labeled L-Lys to methanogenic Archaea with the pylTSBCD gene cluster, L-Pyl-containing methylamine methyltransferase was obtained by purification and characterized by mass spectrometry. Further, the biosynthetic process that converts two L-Lys molecules into one L-Pyl molecule was revealed
[38]. The pylBCD genes are responsible for the tRNA-independent synthesis of L-Pyl [39], while the pylT gene encodes tRNACUA (also called tRNAPyl), and the pylS gene encodes pyrrolysyl-tRNA synthetase
[40]. Further, the introduction of the pylTSBCD genes into E. coli enables the incorporation of endogenously biosynthesized pyrrolysine into proteins. Ho et al. improved the L-Pyl production capacity of E. coli via rational engineering and directed evolution of the whole L-Pyl biosynthetic pathway. They also developed alternating phage-assisted non-continuous evolution (Alt-PANCE), which alternates mutagenic and selective phage growth, to accommodate the toxicity of the L-Pyl biosynthetic genes [41]. The evolved pathway gave a 32-fold increase in the yield of L-Pyl-incorporating protein compared with the rationally modified pathway. The evolved PylB mutant showed a 4.5-fold increase in intracellular levels and a 2.2-fold increase in protease resistance.
Gamma-aminobutyric acid (GABA), which has high nutritional value, can be produced by lactic acid bacteria [42][43][44]. The GABA production capacity of Lactiplantibacillus plantarum was improved by adjusting crucial fermentation parameters: optimizing the inoculum percentage, initial pH, inorganic ions, and nutrient concentrations significantly improved the strain's ability to produce GABA
[45].
Selenium is an essential micronutrient that is incorporated, in the form of selenocysteine, into the active sites of specific selenoproteins. Selenium-containing proteins play an important role in the regulation of organisms and can be used as research targets for the treatment of diseases including cancer, diabetes, Alzheimer’s disease, mental disorders, and cardiovascular diseases
[46][47][48]. Normally, biosynthesized ncAAs are formed in the cytoplasmic matrix and are then linked by an aminoacyl-tRNA synthetase (aaRS) to the corresponding tRNA, which allows their incorporation into proteins. The biosynthetic pathway of selenocysteine is different. Selenocysteine has a cognate tRNASec, but there is no free selenocysteine in the cytoplasmic matrix and no corresponding selenocysteinyl-tRNA synthetase. The synthesis of selenocysteine therefore does not begin with the ligation of selenocysteine to tRNASec; rather, seryl-tRNA synthetase first attaches L-Ser to the non-cognate tRNASec to form seryl-tRNASec. In bacteria, selenocysteine synthase (SelA) acts directly on seryl-tRNASec and removes the hydroxyl group from the seryl moiety to generate an intermediate. The intermediate then receives activated selenophosphate to eventually form selenocysteinyl-tRNASec. Subsequently, selenocysteinyl-tRNASec pairs with the UGA codon to complete the incorporation of selenocysteine into the protein
[49][50][51].
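The recoding idea shared by L-Pyl and selenocysteine incorporation can be illustrated with a toy translation routine in which a stop codon is read as an ncAA when the matching charged tRNA is available. The sketch below is a deliberately simplified illustration, not a model of the cellular machinery: it ignores context signals such as the SECIS element that UGA recoding requires in cells, it uses only a small excerpt of the standard codon table, and the function name `translate`, the `RECODING` map, and the example mRNA are assumptions made for illustration. The one-letter codes U (selenocysteine) and O (pyrrolysine) are the standard ones.

```python
# Excerpt of the standard genetic code; only the codons used in the example.
CODON_TABLE = {
    "AUG": "M", "UCU": "S", "AAA": "K", "GGC": "G",
    "UAA": "*", "UAG": "*", "UGA": "*",
}

# Stop codons that can be reassigned when the matching charged tRNA is present.
RECODING = {
    "UGA": "U",  # selenocysteine, delivered by selenocysteinyl-tRNA(Sec)
    "UAG": "O",  # pyrrolysine, delivered by tRNA(CUA), i.e. tRNA(Pyl)
}


def translate(mrna: str, recoded: frozenset = frozenset()) -> str:
    """Translate an mRNA, reading the stop codons in `recoded` as ncAAs."""
    protein = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        if codon in recoded and codon in RECODING:
            protein.append(RECODING[codon])
            continue
        residue = CODON_TABLE[codon]  # KeyError for codons outside the excerpt
        if residue == "*":
            break  # an ordinary stop codon terminates translation
        protein.append(residue)
    return "".join(protein)


# Hypothetical message: Met-Ser-(UGA)-Lys-(UAG)-Gly-stop
mrna = "".join(["AUG", "UCU", "UGA", "AAA", "UAG", "GGC", "UAA"])

print(translate(mrna))                             # MS
print(translate(mrna, frozenset({"UGA"})))         # MSUK
print(translate(mrna, frozenset({"UGA", "UAG"})))  # MSUKOG
```

Without recoding, translation stops at the first in-frame UGA; enabling each reassignment in turn extends the product by the corresponding ncAA.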
Genome-scale models (GEMs) of metabolism, which comprise the full inventory of metabolic reactions encoded by the genome of an organism, have been used to achieve the efficient production of ncAAs [52]. GEMs can be used to explore trade-offs between the growth rate and production, while computer simulations can be used to analyze metabolic pathways and identify strategies for improving production. For example, the
papBAC gene cluster from
Pseudomonas fluorescens was introduced into
E. coli strain EcNR2 to achieve the production of
p-amino-phenylalanine (
pAF), but there was still a trade-off between
pAF production and the growth rate
[53]. To increase pAF production, a GEM of E. coli metabolism was used in silico to identify metabolic pathways and characterize the metabolism of the recombinant strain. Upregulating the metabolic flux in the chorismate biosynthetic pathway by eliminating feedback inhibition was the most effective strategy for increasing pAF production
[54].
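The growth-production trade-off that GEM simulations explore can be illustrated with a toy mass balance around a single shared precursor. The sketch below is not a genome-scale model and does not reproduce the analysis in [54]. It assumes a hypothetical chorismate supply flux that is split between biomass formation and pAF synthesis, and every number and name in it (`CHORISMATE_SUPPLY`, `BIOMASS_YIELD`, `max_growth`, `max_paf_flux`) is invented for illustration.

```python
# Hypothetical parameters in arbitrary units, chosen only for illustration.
CHORISMATE_SUPPLY = 10.0  # total flux into the shared precursor pool
BIOMASS_YIELD = 0.05      # growth rate obtained per unit of precursor flux


def max_growth(supply: float = CHORISMATE_SUPPLY) -> float:
    """Growth rate when all precursor flux is directed to biomass."""
    return BIOMASS_YIELD * supply


def max_paf_flux(growth_rate: float, supply: float = CHORISMATE_SUPPLY) -> float:
    """Largest pAF flux compatible with a required growth rate.

    Steady-state balance on the precursor pool:
        supply = growth_rate / BIOMASS_YIELD + pAF flux
    """
    if not 0.0 <= growth_rate <= max_growth(supply):
        raise ValueError("growth rate outside the feasible range")
    return supply - growth_rate / BIOMASS_YIELD


# Production envelope: sweep the required growth rate from zero to its maximum.
envelope = [
    (fraction * max_growth(), max_paf_flux(fraction * max_growth()))
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0)
]
for growth, paf in envelope:
    print(f"growth {growth:.3f} -> max pAF flux {paf:.2f}")

# Relieving feedback inhibition is represented here simply as a larger supply flux.
print(max_paf_flux(0.25, supply=15.0))
```

In this toy setting, every increment of growth costs a fixed amount of product flux, and raising the precursor supply shifts the whole envelope upward. A real GEM evaluates the same kind of question over thousands of coupled reactions, typically with linear programming.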
The construction of efficient microbial cell factories for ncAAs has become popular in recent years. These cell factories are mainly created by reconstructing synthesis pathways, designing and modifying key enzymes, coordinating precursor regulation, knocking out competing bypass pathways, constructing cofactor regeneration systems, and intelligently regulating the fermentation process. So far, only a few biosynthetic pathways of ncAAs have been confirmed
[11]. Advances in synthetic and computational biology, together with multidisciplinary collaborations, have begun to shed light on ncAA biosynthesis. In the future, the precise design of ncAA biosynthetic pathways may be accomplished using advanced bioinformatics or biosynthesis simulation tools. Additionally, more chassis with high tolerance to specific ncAAs must be engineered or screened to increase the compatibility between microbes and heterologous ncAA biosynthetic pathways.