PE_PGRS33 and humoral immune response
Subjects: Cell Biology

PE_PGRS proteins are surface antigens of Mycobacterium tuberculosis (Mtb) and a few other pathogenic mycobacteria. The PE_PGRS33 protein is among the most studied PE_PGRSs. It is known that the PE domain of PE_PGRS33 is required for translocation of the protein through the mycobacterial cell wall, where the PGRS domain remains available for interaction with host receptors. Interaction with Toll-like receptor 2 (TLR2) promotes secretion of inflammatory chemokines and cytokines, which are key in the immunopathogenesis of tuberculosis (TB). Here, we address key challenges in the development of a TB vaccine and attempt to provide a rationale for the development of new vaccines aimed at fostering a humoral response against Mtb. We show that the PGRS domain of PE_PGRS33 exposes four PGII sandwiches on the outer surface, which we propose to be directly involved, through their loops, in the interactions with host receptors and, as such, to be promising targets for a vaccination strategy aimed at inducing a humoral response.

  • protein structure

1. Introduction

Tuberculosis (TB) is still the world’s leading infectious cause of death, according to the World Health Organization (WHO) [1,2]. Its etiological agent, Mycobacterium tuberculosis (Mtb), kills approximately two million people every year and latently infects one third of the world’s population. Latency is one of the most remarkable features of TB infection: Mtb establishes a dynamic equilibrium with the host immune system that can last a lifetime, with no signs or symptoms of disease [3,4]. It is assumed that during latency, Mtb persists in host tissues mostly in a dormant state. Resuscitation from dormancy, which is orchestrated by a set of cell wall hydrolases [5,6,7,8,9], is a regular event in the homeostasis of Mtb infection that continuously replenishes the pool of replicating bacilli after their elimination by the host immune response [10,11]. TB reactivation occurs when the equilibrium between Mtb and the host immune response is broken in favor of bacterial replication and tissue damage.
Active TB disease is curable with long-lasting multidrug therapeutic regimens, but the emergence of drug-resistant TB represents a major obstacle to future TB care, with important economic and social consequences [12]. Ambitious goals for better controlling the global TB epidemic can be met with the development of new drug treatments, improved diagnostics, and most importantly the availability of a new and more effective vaccine [1,2]. At present, and 100 years after its introduction, Mycobacterium bovis Bacille Calmette and Guérin (BCG) is the only vaccine available for TB control [13,14], though its protective activity in preventing TB in adults is variable, incomplete, and overall insufficient [15,16,17]. There is an urgent need for a new and more effective vaccine, yet poor understanding of the complex relationship between Mtb and the human immune system, paired with the lack of immunological correlates of protection, makes this endeavor challenging [18]. 

2. Structural Features of PE_PGRS33, the PE_PGRS Prototype

Knowing the three-dimensional structure of an antigen provides important insights into the molecular nature of host–pathogen interactions and into the key epitopes that may serve as targets for the host antibody response. Although structural data on PE_PGRS proteins are not available, insightful information can be obtained by modeling techniques, learning from homologous proteins. Among PE_PGRS proteins, PE_PGRS33 is one of the best studied for its interaction with the immune system and has been considered a model for PE_PGRSs. As such, its putative role as a vaccine candidate is worth investigating [81]. PE_PGRS33 is a large protein of 498 residues with a modular architecture. A search in the PFAM database identifies only a PE domain at the N-terminal region of the protein, a small domain (residues 1–93) that is found almost exclusively in actinobacteria, the only exception being Tulasnella calospora, a species of patch-forming fungus in the Tulasnellaceae family. A conserved GRPLI linker domain (l-GRPLI), likely acting as a transmembrane anchor, connects the PE domain to a distinctive region that is not predicted by PFAM and is characterized by multiple repeats containing the GGA-GGX motif interspersed with unique sequences; this region is commonly denoted the PGRS domain (Figure 1).
Figure 1. Domain organization and sequence of PE_PGRS33. In the PE domain, the conserved PE motif is colored red. In the PGRS domain, sequences of the four PGII sandwich motifs are colored green, purple, blue, and cyan.
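The repeat pattern described above can be illustrated with a short script. This is a minimal sketch, assuming that "GGA-GGX" is read as glycine-glycine triplets in which the third residue is either alanine or any other residue; the function name and the toy sequence are hypothetical and are not taken from PE_PGRS33.

```python
import re

def scan_gg_triplets(seq: str) -> dict:
    """Count GGA and other GGX triplets and the overall glycine fraction."""
    seq = seq.upper()
    # The lookahead lets overlapping triplets (e.g. in 'GGGA') each be counted.
    triplets = re.findall(r"(?=(GG[A-Z]))", seq)
    gga = sum(1 for t in triplets if t == "GGA")
    return {
        "GGA": gga,
        "GGX_other": len(triplets) - gga,
        "gly_fraction": seq.count("G") / len(seq) if seq else 0.0,
    }

# Hypothetical toy sequence used only to exercise the function.
toy = "GGAGGNGGTGGAGLLIGGSGGA"
print(scan_gg_triplets(toy))
```

Run on a real PGRS sequence, a scan of this kind would give a rough picture of how densely the glycine-rich triplets are packed between the unique stretches.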

2.1. The PE Domain

The PE domain takes its name from the conserved Pro–Glu (PE) amino acids at its N-terminus (residues 7–8) [55] (Figure 1). This domain is responsible for PE_PGRS33 translocation via ESX-5 and for cell wall localization, with a significant role played by the 30 amino acids at its N-terminus [82]. A search in the PFAM database shows that the PE family is a member of clan EsxAB (CL0352), which also includes the more common PPE family and the WXG100 family; the latter includes the well-known antigens ESAT-6 (6 kDa early secreted antigenic target) and CFP-10 (10 kDa culture filtrate protein) in Mtb, and EsxA (ESAT-6-like extracellularly secreted protein A) and EsxB in Staphylococcus aureus.
Crystal structures have been reported for the two PE domains PE8 and PE25, in both cases forming a heterocomplex with PPE partners (Table 2). In all structures, PE and PPE interact via a hydrophobic interface, forming a four-helix bundle composed of two α-helices from the PE module and two α-helices from the PPE module (Figure 2). The crystal structure of the ESX-5-secreted PE25–PPE41 heterodimer in complex with the ESX-5-encoded cytoplasmic chaperone EspG5 shows that EspG5 binds to a highly conserved hydrophobic chaperone-binding sequence on PPE, termed the hh motif [83] (Figure 2).
Figure 2. Cartoon representation of the crystal structure of the ternary complex of PE25–PPE41 (orange/purple) with the chaperone EspG5 (green). The signature ESX secretory motif YxxxD/E is located at the C-terminal side of PE25 (orange), whereas the EspG5 binding region is located on the hh motif of PPE41 (gray) [83].
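As an illustration of how a signature such as the ESX secretory motif can be located in a sequence, the sketch below uses a regular expression. It assumes the motif is read literally as tyrosine, any three residues, then aspartate or glutamate; the function name and the toy sequence are hypothetical, and the sequence is not that of PE25.

```python
import re

# Y, any three residues, then D or E; the lookahead reports overlapping hits too.
ESX_MOTIF = re.compile(r"(?=(Y[A-Z]{3}[DE]))")

def find_esx_motifs(seq: str):
    """Return (1-based start position, matched pentapeptide) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in ESX_MOTIF.finditer(seq.upper())]

# Hypothetical toy sequence used only to exercise the function.
toy = "MSFVTAYAGLEQQLYKKKD"
print(find_esx_motifs(toy))
```

A pattern this short will also match by chance, so in practice a hit would be weighed against its position (the C-terminal side of the PE domain) rather than taken as evidence on its own.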
Table 2. Structures of PE-like domains in Mycobacterium tuberculosis (Mtb).
By binding PPE, EspG5 protects the aggregation-prone hh motif on PPE proteins and keeps the dimers in a secretion-competent state. Consistently, point mutations of this conserved hh motif affect protein secretion [83]. For both PE25–PPE41 and PE8–PPE15, binding of the EspG5 chaperone does not cause conformational changes in the heterodimers. The two ternary complexes have highly similar structures: a superposition using DALI gives root mean square deviations (rmsd), computed on the backbone atoms of the PE, PPE, and EspG5 chains, of 1.1 Å, 2.1 Å, and 0.5 Å, respectively. Importantly, EspG5 binds the PE–PPE dimers at a location that does not interfere with the signature ESX secretory motif YxxxD/E at the C-terminal side of PE proteins [87] (Figure 2).
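For readers less familiar with the rmsd values quoted above, the following minimal sketch shows how the quantity itself is computed once two sets of atoms have been paired and superposed. It does not reproduce DALI, which also determines the residue correspondence and the optimal superposition; the function name and the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (in the units of the input) between two equally long lists of
    (x, y, z) tuples that are already superposed and paired atom by atom."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally sized coordinate lists")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical toy coordinates in Å (not taken from any PDB entry):
# the second set is the first one shifted by 1 Å along y.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
b = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(a, b))
```

Because every atom in the toy example is displaced by the same 1 Å, the rmsd equals that displacement; in real comparisons the value summarizes a spread of per-atom deviations.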
As mentioned above, no structural information on the PE domains of PE_PGRS proteins is available so far. Nor is it known whether PE_PGRS proteins strictly require a PPE-like domain, as in the case of the PE proteins in Table 2. Here, we fill this structural gap by homology modeling. The best template was identified by HHPRED as the PE domain of the ESX-5-secreted PE8 (PDB code 5xsf, sequence identity 45.8%), and the homology model was built using MODELLER [88,89,90,91]. The resulting homology model of the PE domain of PE_PGRS33 shows that all hydrophobic/aromatic residues are located on one side of the molecule (Figure 3). This feature, also observed for the PE domains of PE/PPE complexes, suggests that the PE domain of PE_PGRS33 either forms homodimers or is prone to interact with another protein to form a heterodimer. It is not yet clear whether PE_PGRS proteins require a protein partner [59,86], as in the case of PE/PPE proteins. Indeed, pe_pgrs genes are expressed as single operons. Also, the PE-unique LipY protein does not require a partner to be secreted [92]. These findings suggest that PE_PGRSs can be stable on their own, albeit endowed with interaction-prone PE domains for their functions.
Figure 3. Cartoon representations of the homology model of the PE domain of PE_PGRS33. Front and side views are shown on the left and right, respectively. The model was computed with MODELLER using the structure of the PE25 domain from a type VII secretion system of Mycobacterium tuberculosis (Mtb) as a template (PDB code 4w4k, sequence identity 37%). Hydrophobic and aromatic residues are drawn in stick representation.
Importantly, the PE domain of PE_PGRS33 is required for translocation of the protein through the mycobacterial cell wall [63,82,88,90,91]. Once it has fulfilled this role, the PE domain is cleaved from the rest of the molecule, leaving the functional PGRS domain floating on the mycomembrane [59]. Therefore, it is tempting to surmise that some hydrolases may recognize PE_PGRS33 through its PE domain. Mtb encodes a number of PE- and PPE-containing serine α/β hydrolases, which are possible candidates as PE_PGRS33 hydrolases [93] (Figure 4). In addition to these, a PE_PGRS aspartic-type endopeptidase, denoted as PecA in M. marinum and PE_PGRS35 in Mtb, is known to cleave the lipase LipY [92]. More investigations are needed to verify which hydrolase is responsible for PE cleavage of PE_PGRS33 and whether additional hydrolases cleave specific PE_PGRS proteins.
Figure 4. Domain architecture of PE- or PPE-containing Mtb proteins (strain H37Rv), which are predicted to embed a serine hydrolase domain [93]. The last hydrolase, PE_PGRS35, is the Mtb homolog of M. marinum PecA [92].

2.2. PGRS Domain Contains Multiple PGII Modules

In PE_PGRS proteins, the PGRS domain can vary in size from tens to almost 1800 amino acid residues. Its main feature is the presence of multiple repeats containing the GGA-GGX motif interspersed with unique sequences [94]. It has been shown that PGRS domains are available on the mycobacterial surface and can directly interact with host components, such as TLR2 receptors [60,74,95]. To date, the structure of the PGRS domain remains unknown and awaits experimental determination.
The lack of structural data on PGRS domains makes it hard to understand the role of these domains. However, the C-terminal part of the PGRS domain of PE_PGRS33 (residues 406–486) shares a high sequence identity (60%) with the PGII domain of the snow flea antifreeze protein sfAFP from Hypogastrura harveyi. Therefore, we performed homology modeling based on the target–template alignment using ProMod3, with the structure of sfAFP as a template (PDB code 2pne). Aligning the sequence of this C-terminal PGII module against the entire PGRS region identifies three further modules with the same pattern and sequence identities ranging between 53.9% and 63.0% (Figure S1). This analysis shows that the PGRS domain of PE_PGRS33 is formed by four PGII domains, denoted here as PGII1, PGII2, PGII3, and PGII4, all with similar structural features.
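Sequence identities such as those quoted above are derived from a pairwise alignment. The sketch below shows one common way to compute the figure, assuming that identity is counted over aligned columns in which neither sequence has a gap; alignment tools differ in the denominator they use, so this is not necessarily the convention behind the percentages reported here. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [
        (x, y)
        for x, y in zip(aln_a.upper(), aln_b.upper())
        if x != gap and y != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

# Hypothetical toy alignment (not the PE_PGRS33/sfAFP alignment).
print(round(percent_identity("GGAGGN-GGT", "GGSGGNAGGA"), 1))
```

Counting gapped columns in the denominator instead would lower the value, which is one reason identities from different programs are not always directly comparable.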
Polyglycine conformations, such as PGII, are the most flexible ones because the lack of side chains in glycine removes steric hindrances. Consequently, extended regions of the Ramachandran plot are allowed for glycine residues, which can virtually assume any ψ angle [96]. Typical of PGII conformation, each glycine-rich triplet folds into a left-handed, elongated helix with a pitch (rise per turn) of 9.2 Å. This conformation resembles that observed for the polyproline type II (PPII) helices found in collagen [97,98]. In the PGII sandwich, six antiparallel PGII helices are stacked in two antiparallel groups, with three to four triplets spanning the PGII domain length (Figure 5). The organization of the PGRS region in PGII domains explains the high abundance of glycine residues in these domains. Glycine residues are always pointing inwards, in positions where only glycine could be sterically allowed (Figure 5).
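The helix geometry quoted above lends itself to a quick back-of-the-envelope calculation. The sketch below assumes an ideal threefold helix, that is, three residues per turn so that one glycine-rich triplet spans one turn; this assumption and the resulting lengths are illustrative estimates, not values reported for the PE_PGRS33 model.

```python
PITCH_A = 9.2          # helical pitch quoted in the text, in Å
RESIDUES_PER_TURN = 3  # assumption: ideal threefold PGII geometry

def rise_per_residue(pitch: float = PITCH_A, n: int = RESIDUES_PER_TURN) -> float:
    """Axial rise per residue implied by the pitch, in Å."""
    return pitch / n

def helix_length(n_triplets: int, pitch: float = PITCH_A) -> float:
    """Approximate axial length of a helix made of whole triplets, in Å
    (one triplet = one turn under the threefold assumption)."""
    return n_triplets * pitch

print("rise per residue:", round(rise_per_residue(), 2), "Å")
for n in (3, 4):
    print(n, "triplets ->", round(helix_length(n), 1), "Å")
```

Under these assumptions, helices of three to four triplets would be roughly 28 to 37 Å long, which gives a sense of the scale of a single PGII sandwich.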
Figure 5. Stick representation of the homology model of the PGII sandwich domain PGII4, computed using the structure of sfAFP as a template (PDB code 2pne). The front view (A) shows the six PGII helices, whereas the side view (B) shows the localization of hydrophobic residues (e.g., L432, I433, L461) on the lateral loops. The inset shows glycine residues pointing inside and stabilizing the tightly packed PGII helices.
A comparison of the structure of the PE_PGRS33 PGII domains with that of the antifreeze protein sfAFP highlights different surface characteristics despite the shared fold, likely reflecting the completely different functions of PGII domains in the two proteins (Figure 6). In PE_PGRS33, hydrophobic residues of PGII domains are located mainly on loop regions (Figure 5B and Figure 6A), likely accounting for a role of these residues in host recognition. Consistently, as will be shown later, removal of consecutive residues belonging to PGII domains of PE_PGRS33 (alleles from 48 to 52, Table 3) results in weaker immunostimulatory activity, in terms of reduced TNF-α [72,74]. By contrast, hydrophobic residues are mostly located on one side of the PGII structure of sfAFP (Figure 6B) [99]. The accumulation of hydrophobic residues on one side of the sandwich builds a module with one hydrophilic face and one hydrophobic face, and the flat hydrophobic face is supposed to interact tightly with the highly ordered water molecules found at the surface of an ice crystal [99]. In this respect, the flatness that characterizes this domain is functional for its tight association with the ice surface [99]. Interestingly, a PGII sandwich was also observed in the Salmonella bacteriophage S16 long tail fiber. In this case, the PGII sandwich domain plays a role in the interactions of the phage with its host, with its exposed (hydrophobic) loops being determinants of host binding. Therefore, similar to the PGII domains of PE_PGRS33, the glycine-rich core of the PGII sandwich of the S16 long tail fiber exposes hypervariable β-turn loops that determine receptor specificity (Figure 6C).
Figure 6. Surface and stick representations of PGII domains in (A) PE_PGRS33 (domain PGII4); (B) antifreeze protein sfAFP (PDB code 2pne) in two 180° views; and (C) Salmonella bacteriophage S16 long tail fiber (PDB code 6F45). In this panel, the PGII domain is located at the C-terminal side of the protein (stick and surface representation). Adjacent domains are drawn in surface and cartoon representations. In all panels, the color code used for PGII residues is red for negative, blue for positive, green for hydrophobic, and light blue for polar residues.
In contrast to the antifreeze protein sfAFP, it is likely that in the case of both PE_PGRS33 and the bacteriophage S16 long tail fiber, the flatness of the PGII sandwich serves to allow recognition loops to be closely spaced. In both cases, these loops evolve rapidly, as confirmed by their hypervariable nature, in a manner similar to that observed for the three hypervariable complementarity-determining regions (CDRs) of immunoglobulins [100,101]. As in the case of the S16 tail fiber, the PGII loops of PE_PGRS33 are likely exposed to the host and, as such, are likely to be the principal targets of antibodies.

3. PE_PGRS33 as a Promising Target of the Humoral Response

PE_PGRS33 promotes cell death and increases mycobacterial survival in macrophages, as demonstrated by heterologous expression in Mycobacterium smegmatis (Ms), in a process mainly governed by its PGRS domain [74,102,103]. Being localized on the outer surface of the Mtb cell wall, PE_PGRS33 is exposed to the milieu and able to interact with the host [95] (Figure 7A). Consistently, PE_PGRS33 was shown to specifically interact with TLR2 [71,74], though it remains to be determined whether TLR2 requires heterodimerization with TLR1 or TLR6 to properly bind PE_PGRS33, and what role coreceptors such as CD14 or CD36 play.
Figure 7. Schematic representations of (A) the PE_PGRS33 path to the mycobacterial membrane and (B) immune responses to PE_PGRS33 by the host.
The binding of PE_PGRS33 to TLR2 on macrophages can activate two different intracellular pathways. Activation of the MyD88 pathway triggers the expression of genes coding for pro-inflammatory chemokines and cytokines through an NF-κB-dependent mechanism. Secretion of tumor necrosis factor-α (TNF-α) promotes cell necrosis and inflammation, which increases the survival of Mtb inside the host [74] (Figure 7B). Activation of the PI3K pathway triggers the inside-out signaling pathway, which enhances Mtb internalization in host cells while dampening macrophage antimicrobial responses [71]. Interestingly, a polyclonal antiserum raised against native PE_PGRS33 was shown to inhibit Mtb entry into macrophages, without affecting the entry of the Mtb Δpe_pgrs33 strain [80]. This same Δpe_pgrs33 Mtb strain has an impaired capacity to enter macrophages, likely due to the lack of interaction with TLR2 [71,74]. Mice immunized with recombinant PE_PGRS33 were able to combat recombinant M. smegmatis overexpressing PE_PGRS33 in vivo [80]. Hence, the PE_PGRS33/TLR2 interaction promotes inflammation and tissue damage while favoring Mtb replication in the host lesions, establishing PE_PGRS33 as an important Mtb virulence factor [72]. This experimental evidence provides functional support for the hypothesis that PE_PGRS33 may serve as a vaccine antigen candidate against TB [81,104,105]. Antibodies binding PE_PGRS33 on the Mtb surface may neutralize the binding with TLR2, turning off a pathogenetic pathway that promotes TB disease. PE_PGRS33-specific antibodies may also opsonize Mtb, prompting a more efficient uptake and killing by activated macrophages.
Mtb-infected and BCG-vaccinated subjects make antibodies against PE_PGRS33, with the key epitopes located mainly in the PGRS domain [104]. Most of the experimental evidence gathered so far has relied on the use of denatured recombinant PE_PGRS33 [80], making it difficult to identify the relevant epitopes responsible for the interaction with host receptors and potential targets of a protective humoral response.
Conversely, as outlined in this review, structural features of the PGRS portion of PE_PGRS33 account well for the role of this protein as a target of antibodies. The organization of PGRS in PGII sandwich modules crossing the external membrane allows the protein to efficiently expose loops for epitope recognition (Figure 7). Identifying these epitopes and characterizing the interaction between TLR2 and PE_PGRS33 at the atomic level would be very useful for tailoring effective immunization strategies. Altogether, these studies highlight the potential use of PE_PGRS33 as a target of a neutralizing humoral response against TB [80].

This entry is adapted from the peer-reviewed paper 10.3390/cells10010161
