Aminoacyl-tRNA synthetases (aaRSs) are a family of essential and universal ‘house-keeping’ enzymes responsible for catalyzing the esterification of amino acids to their cognate tRNAs.
AaRSs catalyze the highly specific aminoacylation reaction in a common two-step process [1][2] (Figure 1a). In the first step, which requires Mg2+ as a cofactor, the α-carboxylate oxygen of the amino acid attacks the α-phosphate of ATP, forming an aminoacyl-adenylate (aa-AMP) intermediate and releasing inorganic pyrophosphate (PPi) as a byproduct. In the second step, the activated amino acid is transferred to the 2′- or 3′-hydroxyl group of the ribose moiety of the 3′-terminal adenosine of the corresponding tRNA, with the release of AMP (Figure 1a). The resulting aminoacyl-tRNA (aa-tRNA) is used as a substrate by the ribosome for the de novo synthesis of proteins. Typically, these two steps can be considered independently. However, the arginyl-, glutaminyl-, and glutamyl-tRNA synthetases (ArgRS, GlnRS, and GluRS, respectively) and some unusual archaeal lysyl-tRNA synthetases (LysRS) can only catalyze aa-AMP formation in the presence of their cognate tRNA [3][4][5][6]. For these four enzymes, the cognate tRNA plays a dual role during catalysis, acting as an activator in the first reaction and as a substrate in the aminoacyl transfer step.
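Written as a reaction scheme, the two steps described above look roughly as follows. This is a simplified sketch: it shows only the forward direction, omits protonation states, and the last line is simply the sum of the two steps rather than a separately catalyzed reaction.

```latex
\begin{align}
\text{aa} + \text{ATP} &\xrightarrow{\ \text{aaRS},\ \mathrm{Mg^{2+}}\ } \text{aa-AMP} + \mathrm{PP_i} \\
\text{aa-AMP} + \text{tRNA} &\xrightarrow{\ \text{aaRS}\ } \text{aa-tRNA} + \text{AMP} \\
\text{aa} + \text{ATP} + \text{tRNA} &\longrightarrow \text{aa-tRNA} + \text{AMP} + \mathrm{PP_i}
\end{align}
```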
Figure 1. Catalytic mechanisms for the standard (a) and indirect (b) aminoacylation reaction.
In most prokaryotes, tRNA aminoacylation is ensured by 20 different aaRSs, one for each proteinogenic amino acid [1][2]. However, asparaginyl-tRNA synthetase (AsnRS) and glutaminyl-tRNA synthetase (GlnRS), responsible for specific Asn-tRNAAsn and Gln-tRNAGln formation, are missing in certain microorganisms, and in the latter case also in chloroplasts [7]. Their functions are substituted by a non-discriminating aspartyl-tRNA synthetase (AspRSND) and glutamyl-tRNA synthetase (GluRSND), respectively [7]. These enzymes catalyze the formation of mischarged Asp-tRNAAsn and Glu-tRNAGln, which are then converted to Asn-tRNAAsn and Gln-tRNAGln by tRNA-dependent amidotransferases (AdT) [8][9][10] (Figure 1b).
Apart from the 20 canonical aaRSs, two unique aaRSs have been identified in archaea, phosphoseryl-tRNA synthetase (SepRS) [11] and pyrrolysyl-tRNA synthetase (PylRS) [12], that expand the genetic code to noncanonical amino acids. In some methanogenic archaea, SepRS ligates phosphoserine onto tRNACys, forming Sep-tRNACys, which is then converted to Cys-tRNACys by the Sep-tRNA:Cys-tRNA synthase [13]. This complete aminoacylation reaction is conceptually similar to the aforementioned indirect formation of Asn-tRNAAsn and Gln-tRNAGln. Pyrrolysine (Pyl) is inserted into the protein sequence at an in-frame UAG stop codon by Pyl-tRNAPyl, which is itself produced by PylRS. Interestingly, PylRS and tRNAPyl function as an orthogonal pair, excluding any interactions with other aaRSs and tRNAs. Based on this finding, enzyme engineering is widely used to generate PylRS variants for the incorporation of noncanonical amino acids into recombinant proteins [14][15]. The technique has recently been improved further: multiple chimeric tRNA synthetase/chimeric tRNA pairs were designed, allowing orthogonal and highly efficient incorporation of a large variety of aromatic and fluorescent amino acids into a biologically produced product [16].
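The three indirect routes described above (to Asn-tRNAAsn, Gln-tRNAGln, and Cys-tRNACys) share the same shape: one enzyme loads a precursor amino acid onto the tRNA, and a second enzyme converts it while it is still attached. A minimal Python sketch of that pattern is given below. The dictionary layout and the function name `describe_route` are illustrative choices, not part of any established library.

```python
# Indirect aminoacylation routes as described in the text.
# Key: final aa-tRNA product.
# Value: (enzyme that charges the precursor, tRNA-bound intermediate,
#         enzyme that converts the intermediate). Layout is illustrative.
INDIRECT_ROUTES = {
    "Asn-tRNAAsn": ("AspRSND", "Asp-tRNAAsn", "AdT"),
    "Gln-tRNAGln": ("GluRSND", "Glu-tRNAGln", "AdT"),
    "Cys-tRNACys": ("SepRS", "Sep-tRNACys", "Sep-tRNA:Cys-tRNA synthase"),
}


def describe_route(product: str) -> str:
    """Return a one-line description of the indirect route to `product`."""
    charging_enzyme, intermediate, converting_enzyme = INDIRECT_ROUTES[product]
    return (
        f"{charging_enzyme} forms {intermediate}; "
        f"{converting_enzyme} converts it to {product}"
    )


for product in INDIRECT_ROUTES:
    print(describe_route(product))
```

Keeping the intermediate as an explicit field makes the parallel between the amidotransferase routes and the SepRS route visible at a glance.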
In contrast to prokaryotes, humans encode 36 distinct tRNA synthetases: 16 cytoplasmic (including the bifunctional glutamyl-prolyl tRNA synthetase (EPRS), which is responsible for aminoacylation with both Glu and Pro), 17 mitochondrial, and three dual-localized aaRSs (GlnRS, GlyRS, and LysRS) present in both the cytoplasm and mitochondria [17]. Based on their primary sequences and corresponding structures, cytoplasmic and mitochondrial aaRSs are considered to have archaeal and eubacterial origins, respectively. This likely reflects the endosymbiotic origin of eukaryotes, in which a eubacterial cell was engulfed by an archaeal host [18]. Consistent with these origins, the catalytic pathway of the human enzymes is identical to that found in prokaryotes (Figure 1).
Due to their common biochemical activity, aaRS members were originally hypothesized to have evolved from a single common ancestor. This hypothesis was initially supported by the crystal structures of aaRSs determined prior to 1989, including Bacillus stearothermophilus TyrRS [19], E. coli MetRS [20], and GlnRS [21], which showed significant structural homology to each other. Later studies showed that two structurally distinct classes exist, and that the above representatives all coincidentally belonged to the so-called class I group (Table 1). The catalytic domains of all class I aaRSs adopt a Rossmann fold (RF, also known as the dinucleotide-binding fold), which features a five-stranded parallel β-sheet connected by α-helices (Figure 2a,b). Class I enzymes are usually monomeric and possess two conserved signature sequences, HIGH (His-Ile-Gly-His) and KMSKS (Lys-Met-Ser-Lys-Ser), that are involved in ATP binding [22][23] (Figure 2b). At the primary sequence level, these two motifs cap the two ends of the catalytic domain. The HIGH sequence is located at the amino terminus, in the first α-helix of the RF, whereas KMSKS is typically found in a loop just after the fifth strand, close to the carboxyl terminus of the RF (Figure 2). The anticodon binding domain, responsible for binding the tRNA anticodon region, is typically located downstream of the catalytic domain [24].
Figure 2. The architectures of the catalytic sites of class I and II aaRSs. (a) The catalytic domains and signature sequence motifs in class I (left) and class II (right) aaRSs. (b) The binding pockets of ATP and amino acid in class I (left) and class II (right) aaRSs. The crystal structure of B. stearothermophilus TrpRS (Bs-TrpRS) in complex with ATP and tryptophanamide (PDB entry: 1MAU) represents class I aaRSs, while the structure of Enterococcus faecalis ProRS (Ef-ProRS) in complex with ATP and prolinol (PDB entry: 2J3M) represents class II aaRSs. For clarity, only the catalytic domains are shown, corresponding to residues 1–221 for Bs-TrpRS and residues 19–214 and 405–456 for Ef-ProRS. The conserved class I HIGH and KMSKS sequence motifs are colored in magenta and orange, respectively. Motifs 1, 2, and 3 of the class II aaRSs are shown in orange, magenta, and grey, respectively. The binding cavities of ATP (slate) and amino acid (yellow) are shown as semi-transparent surface representations, with ligands shown as sticks.
Table 1. Subgroups of class I and class II aminoacyl-tRNA synthetases (aaRSs).
Class I aaRSs | | | Class II aaRSs | | |
---|---|---|---|---|---|
Ia | Ib | Ic | IIa | IIb | IIc |
CysRS | ArgRS | TrpRS | GlyRS 2 | AsnRS | AlaRS |
IleRS | GlnRS | TyrRS | HisRS | AspRS | GlyRS 4 |
LeuRS | GluRS | | ProRS | LysRS 3 | PheRS |
MetRS | LysRS 1 | | SerRS | | PylRS |
ValRS | | | ThrRS | | SepRS |
1 Monomeric LysRSs are found in many archaea and some bacteria. 2 Archaeal and eukaryotic GlyRSs are homodimers belonging to Subclass IIa. 3 Class II LysRSs are dimeric. 4 Bacterial GlyRSs are tetramers containing α2β2 subunits.
Sequence analysis has shown that the class I signature motifs apply to only 10 of the 20 standard aaRSs. Using a limited number of sequences, the group of D. Moras defined the remaining 10 aaRSs as a separate class II group (Table 1 and Figure 2), whose members all share three cryptically conserved motifs [25]. This classification was further supported by X-ray crystal structures of E. coli SerRS [26] and the yeast AspRS·tRNAAsp complex [27], which revealed a catalytic core architecture very different from that found in class I enzymes (Figure 2). The common catalytic domain identified in all class II aaRSs is composed of a six-stranded β-sheet flanked by a number of α-helices (Figure 2a), and most class II members are homodimers. The three conserved motifs are important for the biological assembly and aminoacylation activity of class II aaRSs [24]. Briefly, motif 1 is composed of a long α-helix connected to a β-strand that terminates in a highly conserved proline residue, and it is involved in the dimer interface. Motifs 2 and 3 form the ATP and amino acid binding sites, where ATP is held in a horseshoe conformation (Figure 2b).
In addition to the structural differences, the two-group classification also reflects distinct biochemical properties of the members. Typically, the class I enzymes approach the tRNA acceptor stem from the minor groove and catalyze aminoacylation directly to the 2′-hydroxyl group of the terminal adenosine A76 [22][23][28]. In contrast, all class II family members approach the acceptor stem of tRNA from the major groove and transfer the amino acid to the 3′-hydroxyl of the terminal adenosine, with the exception of PheRS, which aminoacylates the 2′-hydroxyl of the tRNAPhe [29].
On the basis of structural and sequence similarity, class I aaRSs can be further divided into three subgroups: Ia, Ib, and Ic [2][30][31] (Table 1). The enzymes IleRS, LeuRS, MetRS, ValRS, and CysRS are grouped together as class Ia and are specific for amino acids with aliphatic or sulfur-containing side chains. The members of subclass Ib, comprising ArgRS, GlnRS, GluRS, and archaeal LysRS, recognize polar amino acids and require the presence of the corresponding tRNA for the first-step adenylate formation [3][4][5][6]. Dimeric TyrRS and TrpRS, belonging to subclass Ic, are responsible for the aminoacylation of aromatic amino acids.
The class II aaRSs can also be divided into three subgroups based on sequence identity. The members of subgroup IIa, comprising HisRS, ProRS, SerRS, ThrRS, and archaeal/eukaryotic GlyRS, share a homologous C-terminal anticodon-binding domain that recognizes the anticodon loop of the cognate tRNA. The exception is SerRS, which recognizes its six variant tRNAs for serine through their long variable arm [32]. AspRS, AsnRS, and LysRS are usually grouped together as class IIb and contain an N-terminal extension that interacts with the anticodon stem of the corresponding tRNAs, all of which have a conserved uracil in the middle of the anticodon sequence. Subclass IIc consists of AlaRS, bacterial GlyRS, PheRS, SepRS, and PylRS. Most of the synthetases in this subclass are tetramers assembled as α4 or α2β2, with the exception of PylRS, which only forms a dimer [24][33].
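The classification in Table 1, together with the typical aminoacylation sites described above, can be captured in a small lookup structure. The Python sketch below is only an illustration: the names `SUBCLASSES`, `subclasses_of`, and `acylation_site` are hypothetical, and the 2′/3′ rule encodes the "typical" behavior stated in the text rather than every known exception.

```python
# Subclass membership transcribed from Table 1.
SUBCLASSES = {
    "Ia": ["CysRS", "IleRS", "LeuRS", "MetRS", "ValRS"],
    "Ib": ["ArgRS", "GlnRS", "GluRS", "LysRS"],            # monomeric LysRS
    "Ic": ["TrpRS", "TyrRS"],
    "IIa": ["GlyRS", "HisRS", "ProRS", "SerRS", "ThrRS"],  # archaeal/eukaryotic GlyRS
    "IIb": ["AsnRS", "AspRS", "LysRS"],                    # dimeric LysRS
    "IIc": ["AlaRS", "GlyRS", "PheRS", "PylRS", "SepRS"],  # bacterial GlyRS
}


def subclasses_of(enzyme: str) -> list[str]:
    """Return every subclass that lists `enzyme` (LysRS and GlyRS appear twice)."""
    return [name for name, members in SUBCLASSES.items() if enzyme in members]


def acylation_site(enzyme: str, subclass: str) -> str:
    """Typical ribose hydroxyl acylated on A76, following the rule in the text."""
    if subclass.startswith("II"):
        # Class II enzymes use the 3'-OH, with PheRS as the stated exception.
        return "2'-OH" if enzyme == "PheRS" else "3'-OH"
    # Class I enzymes typically use the 2'-OH.
    return "2'-OH"


print(subclasses_of("LysRS"))          # ['Ib', 'IIb']
print(acylation_site("PheRS", "IIc"))  # 2'-OH
```

Returning a list from `subclasses_of` rather than a single value reflects the fact that LysRS and GlyRS each exist in two structurally distinct forms.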
To ensure the fidelity of the aminoacylation reaction, aaRSs possess the ability to discriminate the specific amino acids and their cognate tRNAs in the complex intracellular environment, employing both passive substrate selectivity and active editing mechanisms [34]. Defects in aaRS proof-reading result in a series of amino acid related toxicities causing cell death in microorganisms and some neurological diseases in mammals [35][36].
Recognition of the correct tRNA by aaRSs is supported by the presence of positive identity elements, which facilitate productive interactions between enzyme and cognate tRNA, and negative elements that avoid mischarging of noncognate tRNA. While most determinants of tRNA are located in the acceptor stem and the anticodon loop, there are some determinants that are unique to specific tRNA species. These include G-1 recognition in tRNAHis [37], the deviating Levitt base pair G15:G48 in tRNACys [38], and the wobble base pair G3:U70 in tRNAAla [39]. In the case of SerRS and LeuRS, both having six variant tRNA substrates, instead of the anticodon loop, the variable arm of the tRNA is the major determinant [34].
Compared with the relatively straightforward discrimination of tRNAs, selecting the correct amino acid is much more challenging. AaRSs achieve amino acid substrate specificity, and thereby fidelity, by employing a double-sieve model. The first filter involves the preferential binding of the correct amino acid, and the second involves selective editing of any mischarged tRNA. The active site of an aaRS acts as the first sieve, preferentially activating the cognate amino acid by combining selective, direct interactions between the aaRS and the specific amino acid with steric occlusion, which prevents the binding of larger amino acids. Activation of structurally similar proteinogenic, non-proteinogenic, and smaller amino acids is often observed, but typically at a lower catalytic rate [40].
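The logic of the double-sieve model can be illustrated with a deliberately crude toy model in Python. It assumes that side-chain heavy-atom count is an adequate proxy for size and that the editing site accepts only substrates smaller than the cognate amino acid. Both are simplifying assumptions made for this sketch, not claims from the text, and the model ignores shape and chemistry entirely. The names `SIDE_CHAIN_SIZE` and `double_sieve` are hypothetical.

```python
# Side-chain heavy-atom counts, used here as a crude size proxy (assumption).
SIDE_CHAIN_SIZE = {"Ala": 1, "Val": 3, "Ile": 4, "Phe": 7}


def double_sieve(amino_acid: str, cognate: str) -> str:
    """Classify the fate of `amino_acid` at a synthetase specific for `cognate`."""
    size = SIDE_CHAIN_SIZE[amino_acid]
    cognate_size = SIDE_CHAIN_SIZE[cognate]
    # First sieve: the active site sterically excludes larger amino acids.
    if size > cognate_size:
        return "rejected at active site"
    # Second sieve (toy assumption): the editing site admits only smaller
    # substrates, so the cognate amino acid escapes hydrolysis.
    if size < cognate_size:
        return "activated, then edited"
    return "charged onto tRNA"


for aa in ("Phe", "Val", "Ile"):
    print(aa, "->", double_sieve(aa, cognate="Ile"))
```

With an isoleucine-specific enzyme as the example, the larger Phe never passes the first sieve, the smaller Val passes it but is removed by the second, and only Ile ends up on the tRNA.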
Incorrectly activated amino acids subsequently undergo proofreading, by either pre-transfer or post-transfer editing, in the editing site, which serves as the second sieve [41][42]. Pre-transfer editing refers to the hydrolysis of misactivated aa-AMP to amino acid and AMP, prior to transfer of the amino acid to the 3′ terminus of the tRNA [42], via a tRNA-independent or tRNA-dependent pathway. Class II SerRS, ProRS, and LysRS and class I MetRS use a tRNA-independent hydrolysis pathway that takes place in their active sites. For example, E. coli and Saccharomyces cerevisiae SerRSs cannot discriminate threonine (which has one more methyl group in its side chain) and cysteine (in which the hydroxyl is replaced by a thiol group) from serine. It has been shown that S. cerevisiae SerRS catalyzes the hydrolysis of misformed Thr-AMP significantly faster than the spontaneous hydrolysis rate of this intermediate in solution, indicating that the hydrolysis of Thr-AMP occurs in the active site of SerRS [43]. In contrast, class I IleRS and LeuRS perform pre-transfer editing to degrade the non-cognate Val-AMP and Ile-AMP intermediates, respectively, in a tRNA-dependent fashion [44].
Post-transfer editing usually takes place in a distinct domain of the aaRS rather than in the active site, and there the incorrect aminoacyl-tRNA species are broken down to non-cognate amino acid and tRNA. On the basis of extensive structural and biochemical studies, two post-transfer editing models have been proposed: a direct translocation pathway for class I LeuRS, IleRS, and ValRS, and a dissociation-reassociation pathway for class II ThrRS, AlaRS, and PheRS [34][45]. LeuRS, IleRS, and ValRS all possess a connective polypeptide 1 (CP1) domain, known as the editing domain, which is connected to the main enzyme body via two β-strands. As shown by crystal structures of E. coli LeuRS, the misacylated 3′-end of tRNALeu can be directly translocated to the editing domain for hydrolysis, while the rest of the tRNA body preserves its interactions with LeuRS through significant conformational changes of the multiple domains of the enzyme [46]. This is likely the basis for the slower release of the final aminoacyl-tRNA product by most class I families [47]. For class II aaRSs, however, the rate-limiting step of aminoacylation is aa-AMP formation, not aa-tRNA release. Therefore, for this family, it is proposed that incorrect aa-tRNA rapidly dissociates from the enzyme and then rebinds for the post-transfer editing step. This is supported by biochemical studies of recombinant fragments of E. coli AlaRS, which showed that AlaRS recognizes mischarged tRNAAla through a structural domain distinct from that used during aminoacylation [48].
For some tRNA synthetases, the editing domains exist as separate proteins, also known as trans-editing factors, which contribute to translation fidelity by hydrolyzing incorrectly aminoacylated tRNA [34]. These trans-editing factors, which can be considered a third sieve, are homologous to the editing domains of class II aaRSs and function mechanistically in the same way [34][49]. To date, several autonomous editing factors have been identified for class II ProRS, ThrRS, and AlaRS based on sequence analysis. Haemophilus influenzae ProRS, for example, can mischarge tRNAPro with alanine and cysteine, but lacks editing activity against Cys-tRNAPro. Instead, the hydrolysis of Cys-tRNAPro is mediated by an additional protein known as YbaK [50]. Freestanding editing paralogs have also been identified for ThrRS in crenarchaea, where ThrRS-cat and ThrRS-ed are encoded by two individual genes in the chromosome. The former synthesizes Thr-tRNAThr, but also produces mischarged Ser-tRNAThr because it lacks editing activity. Conversely, ThrRS-ed lacks aminoacylation activity, but can deacylate the mischarged Ser-tRNAThr [51]. In addition, trans-editing factors termed AlaXps play an important role as a third sieve, preventing tRNAAla from being mischarged with serine or glycine [52].
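The trans-editing factors named above can be summarized as a mapping from each factor to the mischarged tRNA species it clears. The following Python sketch is only a compact restatement of the examples in this section. The names `TRANS_EDITING` and `factors_for` are hypothetical, and the list is not exhaustive.

```python
# Free-standing editing factors and the mischarged tRNAs they act on,
# restricted to the examples given in the text.
TRANS_EDITING = {
    "YbaK": ["Cys-tRNAPro"],
    "ThrRS-ed": ["Ser-tRNAThr"],
    "AlaXp": ["Ser-tRNAAla", "Gly-tRNAAla"],
}


def factors_for(mischarged_trna: str) -> list[str]:
    """Return the trans-editing factors listed for a given mischarged tRNA."""
    return [
        factor
        for factor, substrates in TRANS_EDITING.items()
        if mischarged_trna in substrates
    ]


print(factors_for("Cys-tRNAPro"))  # ['YbaK']
print(factors_for("Ala-tRNAAla"))  # [] (correctly charged, nothing to edit)
```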
This entry is adapted from the peer-reviewed paper 10.3390/ijms22041750