The Prokaryotic and Eukaryotic Nat Machinery: Comparison
Please note this is a comparison between Version 1 by Markus Wirtz and Version 3 by Conner Chen.

Prokaryotic and eukaryotic N-terminal acetyltransferases (Nats) belong to the general control non-repressible 5 (GCN5)-related N-acetyltransferases (GNAT) superfamily, which counts thousands of members in all three domains of life.

  • compartmentalization
  • co-translational modification
  • GNAT

1. Introduction: N-Terminal Acetylation—An Underestimated Protein Modification

Protein modifications are key modulators of protein fate and are often the first-aid tool for reprogramming cells in response to developmental or environmental cues. Together with phosphorylation and ubiquitination, acetylation is one of the most pervasive protein processing events [1]. Acetylation occurs at the α-amino group of protein N-termini (N-terminal acetylation, NTA) or at the ε-amino group of internal lysine residues (lysine acetylation, KA). Both NTA and KA are present throughout all kingdoms of life and are catalyzed by N-terminal acetyltransferases (Nats) or lysine acetyltransferases (Kats), which transfer acetyl moieties from acetyl coenzyme A (AcCoA) to their respective substrates. Prokaryotic and eukaryotic Nats belong to the general control non-repressible 5 (GCN5)-related N-acetyltransferases (GNAT) superfamily, which counts thousands of members in all three domains of life [2][3][4]. Despite their low overall sequence homology (3–23%), the three-dimensional fold and catalytic domains of GNATs are well conserved (Figure 1A). The core GNAT fold consists of six to seven β-strands (β0–β6) and four α-helices (α1–α4). The loop connecting β4 and α3 harbors a highly conserved R/QxxGxA/G motif, which mediates AcCoA binding [2][5][6]. In higher eukaryotes, the bulk of cytosolic proteins (>80%) are co-translationally acetylated at their N-termini, whereas KA affects selected proteins, most prominently histones [7][8]. While KA is widely recognized as a transcriptional regulator, the overall biological significance of the more prevalent NTA remains unclear [9]. At the molecular level, NTA alters the electrostatic properties of proteins by neutralizing the positive charge at their N-terminus, which results in an increased overall hydrophobicity. In addition, NTA creates a new hydrogen bond acceptor and reduces the nucleophilicity and basicity of the α-amine.
Taken together, these changes have profound implications for the three-dimensional structure, activity, binding properties and lifetime of individual proteins [10]. Since no N-terminal deacetylases have been identified to date, these changes are considered irreversible [11][12]. Hence, NTA was long perceived as an unregulated, and consequently static, co-translational process [13]. This dogma was challenged by the identification of regulatory mechanisms for Nats and a highly diversified family of post-translational Nats in higher eukaryotes [14][15][16][17][18][19]. Specifically, in plants, the characterization of plastid-localized GNAT proteins with dual Nat and Kat activity and the phytohormone-triggered regulation of the ribosome-tethered NatA contributed to this paradigm shift [20][21][22].
Figure 1. The typical GNAT fold is conserved throughout all domains of life. (A) The core GNAT fold consists of six to seven β-strands (β0–β6, light grey) and four α-helices (α1–α4, dark grey). The loop connecting β4 and α3 contains a conserved AcCoA binding motif (R/QxxGxA/G, red cross). Differences between GNAT structures are generally confined to the N-terminal β0 strand. (B) NTA frequency in different organisms as a percentage of the whole proteome. The bars represent the estimated upper limit reported for the individual organisms (1: [23], 2: [20], 3: [24], 4: [25], 5: [26], 6: [27], 7: [28], 8: [29] and 9: [30]).
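The conserved AcCoA-binding motif R/QxxGxA/G lends itself to a simple pattern search. The following minimal Python sketch scans a protein sequence for this pattern; the function name and the toy sequence are hypothetical and serve only as illustration. A sequence match alone does not show that a protein is a GNAT, since the motif is short and is expected to sit in the β4–α3 loop of the fold.

```python
import re

# Conserved AcCoA-binding motif from the text: R/QxxGxA/G, where x is any residue.
# The lookahead makes overlapping occurrences visible as well.
GNAT_MOTIF = re.compile(r"(?=([RQ][A-Z]{2}G[A-Z][AG]))")


def find_gnat_motifs(sequence):
    """Return (1-based start position, matched segment) for each motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in GNAT_MOTIF.finditer(seq)]


# Hypothetical toy sequence, not a real Nat protein.
toy_sequence = "MKLVRAAGHAEEQLLGSGTT"
print(find_gnat_motifs(toy_sequence))  # [(5, 'RAAGHA'), (13, 'QLLGSG')]
```

In practice, such hits would need to be checked against a structural model or a profile-based domain annotation before drawing any conclusion.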

2. The Prokaryotic Nat Machinery

While in humans and plants more than 80% of cytosolic proteins are N-terminally acetylated [20][23], the frequency of NTA declines in single-celled organisms (Figure 1B). In yeast, for instance, only 60% of the proteome is N-terminally acetylated [15].
In bacteria, NTA is an even rarer event. Unlike eukaryotes, bacteria initiate protein biosynthesis with formylated methionine (fMet). Before NTA can occur, the N-terminal formyl group has to be removed co-translationally by peptide deformylase (PDF). For the majority (60%) of proteins, deformylation is followed by the excision of the initiator methionine (iMet) by methionine aminopeptidase (MetAP). Acetylation marks were found on N-termini both with and without iMet and are added by one of the three known bacterial acetyltransferases, “Ribosomal modification I” (RimI), RimJ, and RimL [30][31]. Of these three enzymes, RimJ seems to be the most promiscuous, since the number of N-terminally acetylated proteins in E. coli drops significantly upon depletion of RimJ, but not of RimI or RimL. RimJ predominantly targets N-termini starting with Ser and Thr, but also Ala [32]. Despite their role as ribosome-assembly factors, Rims are absent from mature ribosomes, suggesting that their catalytic activity is purely post-translational [33].
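The order of these processing steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a predictor: the rule that MetAP excises iMet when the second residue is small (Gly, Ala, Ser, Thr, Cys, Pro, Val) is general knowledge about MetAP and is not taken from this text, and all function and variable names are hypothetical. Only the RimJ preference for Ser, Thr and Ala comes from the paragraph above.

```python
# ASSUMPTION (not from the text): MetAP removes iMet when position 2 is a small residue.
METAP_PERMISSIVE = set("GASTCPV")
# From the text: RimJ predominantly targets N-termini starting with Ser, Thr or Ala.
RIMJ_PREFERRED = set("STA")


def process_bacterial_nterminus(sequence):
    """Return (mature sequence, processing steps, RimJ candidate flag)."""
    seq = sequence.upper()
    if not seq.startswith("M"):
        raise ValueError("expected a sequence starting with the initiator Met")
    # Deformylation by PDF always precedes any further N-terminal processing.
    steps = ["deformylation of fMet by PDF"]
    if len(seq) > 1 and seq[1] in METAP_PERMISSIVE:
        seq = seq[1:]
        steps.append("iMet excision by MetAP")
    # A matching first residue only marks a possible substrate, not an acetylated one.
    rimj_candidate = seq[0] in RIMJ_PREFERRED
    return seq, steps, rimj_candidate


print(process_bacterial_nterminus("MSKE"))
print(process_bacterial_nterminus("MKLE"))
```

Because NTA is rare in bacteria, the flag marks at most a candidate N-terminus; whether a given protein is actually acetylated has to be established experimentally.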
Initially, only five endogenous proteins were reported to be N-terminally acetylated in Escherichia coli, including the ribosomal proteins S5, L7/L12, and S18 as well as the elongation factor EF-Tu and the chaperone SecB [26][34][35][36][37][38]. Recent mass spectrometry-based proteome studies expanded this originally short list of N-terminally acetylated proteins in E. coli to over 100 entries, accounting for 10% of the E. coli proteins with experimentally assessed acetylation status [30][32]. In Pseudomonas aeruginosa PA14 and Mycobacterium tuberculosis, for instance, between 18 and 29% of the proteome was found to be N-terminally acetylated (Figure 1B) [28][29].
Acetylation levels are similar in archaea, where 13–29% of all proteins are affected by NTA [26][27][39]. Archaea express a single conserved Nat, which exhibits a broad substrate specificity. The active site of this Nat is a hybrid of known eukaryotic Nat active sites [40][41], suggesting that the cytosolic Nats in eukaryotes derived from this ancestral form [42]. The function of NTA in archaea has only been demonstrated for individual proteins. In the salt-loving archaeon Haloferax volcanii, for instance, the NTA of the α1 proteasome subunit modulates the efficiency of proteolysis by altering the conformation of the channel leading up to the proteasomal core [43]. At the organismal level, the importance of NTA in archaea remains to be elucidated.

3. The Eukaryotic Nat Machinery

So far, six evolutionarily conserved Nats (NatA–F) have been identified in metazoans (Figure 2). The existence of five of those (NatA–C and NatE–F) has been experimentally confirmed in the model plant A. thaliana [20][44][45][46][47][48]. NatD has been proposed to exist in Arabidopsis based on the substantial homology to its human orthologue [7]. Unlike NatD and NatF, most cytosolic Nats are composed of one catalytic subunit and one or more auxiliary subunits that facilitate ribosome association and modulate catalytic properties [49]. While NatA–E are thought to be ribosome-bound in humans and plants, NatF localizes to the plasma membrane in plants and to the Golgi membrane in humans [14][46]. In addition, a family of plastid-localized Nats (GNAT1-7 and GNAT10) with dual Kat/Nat activity was recently characterized in A. thaliana [21][22].
Figure 2. Phylogenetic tree of Nats from different domains of life based on protein sequence comparison. Homologous Nat sequences from the photosynthetic eukaryotes Arabidopsis thaliana (At) and Oryza sativa (Os), the non-photosynthetic eukaryotes Homo sapiens (Hs), Drosophila melanogaster (Ds) and Saccharomyces cerevisiae (Sc), as well as the bacterium Escherichia coli (Ec) and the archaeon Saccharolobus solfataricus (Ss) were aligned with ClustalW. For OsNAA50 and OsNAA60, only one protein could be identified by blasting the respective human orthologs against the rice proteome. The resulting phylogenetic tree was circularized with the iTOL tool (https://itol.embl.de, accessed on 20 October 2022).
Nats are present in all plant organs. While NatA–E and the plastidic Nats are widely expressed in aerial organs except for the male reproductive parts, NatF is most strongly expressed in anthers and pollen. Although the distribution of the plastidic Nats among different tissues is similar, the transcription patterns of the individual enzymes differ, indicating that they might fulfill different roles in specific organs. However, in specific organs, transcript levels of Nats barely change upon various biotic and abiotic stresses. Furthermore, Nats may gain defined functions due to their specific subcellular compartments, as summarized in Figure 3.
Figure 3. Subcellular localization and substrate specificity of N-acetyltransferases in the model plant Arabidopsis thaliana. Catalytic subunits are schematically represented in red, whereas auxiliary subunits are depicted in orange. Subunits for which only predictions of subcellular localization are available are shown in lighter colors. From the plastid Nat family only NatG is shown for simplicity (1: [20][44][50][51][52]; 2: [47]; 3: [45]; 4: [46]; 5: [21][22], ?: debated in Arabidopsis). The pie chart shows the relative contribution of the individual acetyltransferases to the plant acetylome. Estimates are based on experimental data where acetyltransferases were assigned to acetylated N-termini based on their substrate specificity [20][53].
The substrate specificity of Nats is largely determined by the first two amino acids of their substrate proteins [11]. Consistent with the ability of Nats to acetylate distinct N-termini, the Nat catalytic sites differ in shape, size, and electrostatic properties (Figure 4). The catalytic mechanisms of AtNAA50 and AtNAA60 are very similar and rely on tyrosine and histidine residues that coordinate a catalytic water molecule [46][54]. Even though the catalytic mechanisms of AtNatA–NatC have not been uncovered yet, the residues required for catalysis in their human counterparts are conserved in plants [10].
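Because specificity depends mainly on the first two residues, a candidate Nat can be looked up from the start of a sequence. The Python sketch below is a deliberately simplified illustration: the residue classes are assumptions drawn from the general Nat literature rather than from this text (NatA acting on Ser, Ala, Thr, Gly, Val or Cys exposed after iMet removal; NatB on Met followed by Asp, Glu, Asn or Gln; NatC, NatE and NatF on Met followed by a hydrophobic residue), the overlapping Met-retaining classes are not resolved, and NatD and the plastidic Nats are left out. The function name is hypothetical.

```python
# ASSUMPTION: simplified specificity classes from the general literature, not from this text.
NATA_SECOND = set("SATGVC")   # iMet is removed, the exposed residue is acetylated
NATB_SECOND = set("DENQ")     # iMet is retained: Met-Asp, Met-Glu, Met-Asn, Met-Gln
NATCEF_SECOND = set("LIFWY")  # iMet is retained: Met followed by a hydrophobic residue


def candidate_nat(sequence):
    """Return the candidate Nat class for a nascent chain, or None if no class fits."""
    seq = sequence.upper()
    if len(seq) < 2 or seq[0] != "M":
        raise ValueError("expected at least two residues starting with the initiator Met")
    second = seq[1]
    if second in NATA_SECOND:
        return "NatA"
    if second in NATB_SECOND:
        return "NatB"
    if second in NATCEF_SECOND:
        return "NatC/NatE/NatF"
    return None


for nterm in ("MSGR", "MDEL", "MLKK", "MPKK"):
    print(nterm, candidate_nat(nterm))
```

A match identifies a putative substrate only; sequence-based assignment does not guarantee that the N-terminus is acetylated in vivo.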
Figure 4. Three-dimensional models of Arabidopsis thaliana Nats. The AcCoA-binding motifs (A) of Arabidopsis Nats are strongly conserved (shown in red with conserved residues highlighted in ribbon mode). AcCoA is represented in grey. The Nat catalytic sites (B) have distinct surface characteristics in shape, size, and electrostatic properties, which is consistent with their ability to acetylate distinct substrate pools. Catalytically important residues were either reported in [1][2] for AtNatE and AtNatF or estimated based on their human and yeast counterparts for AtNatA–C [3] and are represented in stick mode. The crystal structures of AtNAA50 (6YZZ, green) and AtNAA60 (6TGX, cyan) were downloaded from the Protein Data Bank (https://www.rcsb.org, accessed on 9 November 2022), whereas the three-dimensional structures of the other Nats were generated with SwissModel (https://swissmodel.expasy.org, accessed on 9 November 2022) based on their human or yeast counterparts using the templates 6c9m.2.B (AtNAA10, blue), 7stx.1.A (AtNAA20, yellow) and 7l1k.1.A (AtNAA30, orange).
Interestingly, some proteins are not acetylated even though based on their primary sequence they fit the recognition potential of Nats. A search in the NterDB database (https://nterdb.i2bc.paris-saclay.fr/) reveals that of 1327 nuclear-encoded putative Arabidopsis NatA substrates 179 (14%) are not acetylated. Hence, substrate recognition might depend on so far unknown determinants. Those might include the three-dimensional properties of the nascent chain or competition of Nats with other ribosome-associated factors attracted by those nascent chains.