HIV protease inhibitors are often hampered by drug resistance mutations in both the protease and its viral substrate, Gag. To overcome this drug resistance and inhibit viral maturation, targeting Gag alongside protease may be more effective than targeting protease alone.
1. Introduction
Many anti-HIV drugs interfere directly with the viral life cycle by targeting key viral enzymes [1], e.g., reverse transcriptase inhibitors [2][3][4], integrase inhibitors [5][6], and protease inhibitors [7][8]. While such efforts are already hampered by the emergence of drug resistance mutations in the enzymes (e.g., in [9]), the scenario worsens further when viral enzyme substrates, such as Gag (the HIV protease substrate), are found to contribute synergistically to drug resistance.
Gag and protease play key roles in the viral maturation process [10], in which the immature HIV virion matures into the infectious virion after budding from the infected cell, ready for the next replication cycle. Proteolysis of Gag by protease occurs during the early stage of this maturation (Figure 1A), in which the intact full-length Gag precursor polyprotein is cleaved by the viral protease into functional subunits [10]. To inhibit this proteolysis, protease inhibitors (PIs) compete with Gag for protease binding and thereby block protease activity [11].
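The competitive mode of inhibition described above can be written in the standard Michaelis–Menten form. This is a textbook idealization added for illustration, not a model taken from the cited studies. The symbols are assumptions introduced here: $[S]$ is the concentration of the Gag cleavage site acting as substrate, $[I]$ is the PI concentration, $K_m$ and $V_{\max}$ describe protease activity on that site, and $K_i$ is the inhibition constant of the PI. The second expression is the Cheng–Prusoff relation for a competitive inhibitor.

```latex
v = \frac{V_{\max}\,[S]}{K_m\left(1 + \dfrac{[I]}{K_i}\right) + [S]},
\qquad
\mathrm{IC}_{50} = K_i\left(1 + \frac{[S]}{K_m}\right)
```

Under these simplifying assumptions, a change that raises $K_i$ (weaker PI binding) or lowers $K_m$ (tighter substrate binding) would increase the apparent $\mathrm{IC}_{50}$. Gag processing in the virion is sequential and more complex than this steady-state picture.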
Figure 1. An overview of the Gag and Protease relationship. (A) A schematic of the early stage of viral maturation where HIV-1 Protease cleaves Gag into the functional subunits: Matrix (MA), capsid (CA), nucleocapsid (NC), p6, and two spacer peptides p1 and p2. (B) To inhibit viral maturation, protease inhibitors (PIs in green) are used to competitively inhibit protease binding of Gag. PI resistant mutations are denoted by colored stars, where those in the protease catalytic site are in blue, while those in Gag are red for cleavage sites, and purple for non-cleavage sites.
2. Possible Targets in Gag
The Gag polyprotein consists of the matrix (MA), capsid (CA), nucleocapsid (NC), and p6 domains, and two spacer peptides, p1 and p2. The MA subunit, located at the N-terminus, is essential for targeting Gag to the cell membrane, while CA forms a shell that protects the viral RNA genome and other core proteins during maturation. NC is responsible for RNA packing and encapsidation [12], while the two spacer peptides p1 and p2 regulate the rate and sequence of Gag cleavage by protease [13]. This process of viral assembly is complemented by viral budding, which is moderated by the small proline-rich p6. Mutations at either the N-terminus or C-terminus of these core proteins were reported to block viral assembly and impair Gag binding to the plasma membrane, thereby inhibiting viral budding [12].
Since the Gag cleavage sites do not share a consensus sequence (Figure 2), recognition of the cleavage sites by protease is likely based on their asymmetric three-dimensional structures [14], which fit into the substrate-binding pocket of protease [15]. Cleavage at these sites (each defined by a unique seven-residue peptide sequence around the scissile bond) is highly regulated and occurs at differing rates [16][13][17]. The first cleavage occurs at the site between the p2 peptide and the NC domain (Figure 2), followed by the separation of MA from CA–p2 at a rate ~14-fold slower than that of the first cleavage, and then the release of p6 from the NC–p1 domain (at a rate ~9-fold slower than the first cleavage). In the last step, the two spacer peptides p1 and p2 are cleaved from NC–p1 and CA–p2 at rates ~350-fold and ~400-fold slower, respectively, than the initial cleavage [16][13][15][17].
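The fold differences quoted above can be converted to rates normalized to the first cleavage, which is the convention used in Figure 2. The following is a minimal sketch; the site labels (e.g., "MA/CA") and function name are shorthand chosen here for illustration, and the values are the approximate fold-slowdowns from the text.

```python
# Approximate fold-slowdown of each cleavage relative to the first one
# (p2/NC), listed in the order the text describes them.
# Site labels are shorthand chosen for this sketch.
FOLD_SLOWER = {
    "p2/NC": 1.0,    # first cleavage, used as the reference
    "MA/CA": 14.0,   # MA released from CA-p2
    "p1/p6": 9.0,    # p6 released from NC-p1
    "NC/p1": 350.0,  # spacer peptide p1 removed from NC-p1
    "CA/p2": 400.0,  # spacer peptide p2 removed from CA-p2
}


def normalized_rates(fold_slower):
    """Return each site's rate as a fraction of the first cleavage rate."""
    return {site: 1.0 / fold for site, fold in fold_slower.items()}


rates = normalized_rates(FOLD_SLOWER)
for site, rate in rates.items():
    print(f"{site}: {rate:.4f}")
```

A normalized rate of 1 marks the reference site; every other value is simply the reciprocal of its fold-slowdown.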
Figure 2. The sequential Gag proteolysis by protease. The cleavage sites are marked by their seven-residue sequences, along with the estimated cleavage rates [13] marked by arrows. For easy comparison, the rate at the initial cleavage site is set to 1, while the values at the other cleavage sites show the correspondingly reduced, normalized rates. The cleavage site sequences are colored by physicochemical property, i.e., hydrophobic (black), charged (positive: blue; negative: red), and polar (other colors), and vary in text size according to positional conservation, using WebLogo [18][19]. Structural surface representations of the cleavage sites are also included for visualization.
To date, there are nine PIs in clinical treatment regimes, i.e., Saquinavir (SQV), Ritonavir (RTV), Indinavir (IDV), Nelfinavir (NFV), Fos/Amprenavir (FPV/APV), Lopinavir (LPV), Atazanavir (ATV), Tipranavir (TPV), and Darunavir (DRV) [15]. With increasing PI resistance [20][21][22][23] and cross-resistance [16][21][24][25] conferred by protease mutations that compromise viral fitness, the protease, a homodimer of 99-residue subunits, must trade off enzymatic activity against escape from drug inhibition. Mapped to the resistance to several current PIs [26][27][28][29], many mutations were found to arise spontaneously as part of the natural variance [30] and to be selected for during treatment regimes. Mutations at the active site interfere directly with PI binding via steric perturbation, while those distant from the active site modulate protease activity allosterically [31][32][33][34][35][36][37][38][39][40][41]. However, such mutations often reduce viral fitness, resulting in future repertoires of viruses with compromised fitness [42]. This fitness trade-off is then compensated by additional mutations that restore enzymatic activity to some extent [33][37][38][43].
To fully understand the Gag–protease synergy, the mechanisms by which Gag mutations arise, and the limits on them, need to be studied. Although the sequencing of clinical samples is the predominant source of HIV sequences, there are attempts to study and generate novel mutations [47][45][46] for various HIV proteins. One example of such an effort [47] involved subjecting the Gag mRNA transcript to HIV reverse transcriptase (RT) to explore the repertoire of possible Gag mutations in the absence of drug or immune selection pressures. It was shown that clinically reported mutations could be generated and that the location and type of mutations incidentally avoided crucial locations and drastic changes. While such selection-free platforms can reveal the possible repertoires of Gag mutations for inhibitor design against emerging resistance, the large number of permutations needs to be narrowed down through structural analysis and comparison with known clinical mutations, taking into consideration the in-built mutational biases of the genetic code.
Characterized clinical Gag mutations [48][49][50][24][51][16] are sparse, with many reported to restore reduced binding to mutated proteases [48][49][50][24][51][16][52]. The lack of a high-resolution structure of full-length Gag makes it difficult to analyze structurally the effects of these mutations on the whole of Gag during its binding to protease. Fortunately, the recent full-length model of Gag [53][54] has allowed some investigation of non-cleavage site mutations.
3. The Role of Gag Mutations in Restoring Gag–Protease Synergy in PI Resistance
The mapping of Gag mutations associated with protease drug resistance mutations is summarized in Table 1. Gag cleavage site mutations at the p1/p6 (L449F) and NC/p1 (L449F-Q430R-A431V) sites were found to be associated with protease mutation I84V [16][55]. Similarly, Gag mutations A431V and I437V were mapped to protease mutation V82A [16][56]. Apart from compensating for the loss of viral fitness, mutations P453L (Gag) and I50V (protease) synergistically reduced Amprenavir effectiveness (e.g., by increasing the IC50 value of Amprenavir), and Gag mutations A431V-I437V together with protease V82A were found to lead to Indinavir resistance [16].
Table 1. Gag and Protease paired mutations compensating for viral fitness and viral replication. Gag mutations are colored according to domains: MA (red), CA (green), NC (magenta), and p6 (orange).
| Inhibitor | Strain or Lab Clone | Mutations on Gag | Mutations on Protease |
|---|---|---|---|
| Amprenavir | HIV-1 NL4-3 (pNL4-3) | V35I–L75R–H219Q | L10F–V32I–M46I–I84V |
| Amprenavir | HIV-1 NL4-3 (pNL4-3) | L75R–H219Q–R409K–L449F–E468K | L10F–V32I–M46I–I84V |
| Amprenavir | HIV-1 NL4-3 (pNL4-3) | E12K–V35I–L75R–H219Q–V390D–R409K–L449F–E468K | L10F–V32I–M46I–I54M–A71V–I84V |
| JE–2147 | HIV-1 NL4-3 (pNL4-3) | H219Q–V390D | M46I–I84V |
| JE–2147 | HIV-1 NL4-3 (pNL4-3) | H219Q–V390D–R409K–L449F | V32I–M46I–I47V–V82I–I84V |
| KNI–272 | HIV-1 NL4-3 (pNL4-3) | V35I–E40K–G123E–H219Q–G381S–R409K–A431V | V32I–M46I–A71V–V82I–I84V |
| UIC–94003 | HIV-1 NL4-3 (pNL4-3) | E12K–E40K–G123E–Q199H–H219Q–R409K–G412D–L449F–E468K | L10F–M46I–I50V–A71V |
| Amprenavir | HIV-1 HXB2 | P453L | I50V |
| BILA–1906BS | HIV-1 strain IIIB | L449F | M46L–A71V–I84V |
| BILA–2185BS | HIV-1 strain IIIB | L449F–Q430R–A431V | L23I–V32I–M46I–I47V–I54M–A71V–I84V |
| Indinavir | HIV-1 pNL4.3 | A431V–I437V | V82A |
| Ritonavir/Saquinavir | HIV-1 subtype B # | A431V–L449F | I84V |
# the study involves patients.
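The Gag mutation sets in Table 1 can be tallied to see which substitutions recur across inhibitors. The sketch below is illustrative only. The mutation strings are copied from Table 1 (with plain hyphens in place of en dashes), while the function names and the Gag domain boundaries are assumptions introduced here: the boundaries follow the commonly used HXB2 Gag numbering and are not stated in this text.

```python
import re
from collections import Counter

# Gag mutation sets from Table 1, one string per table row.
TABLE1_GAG = [
    "V35I-L75R-H219Q",
    "L75R-H219Q-R409K-L449F-E468K",
    "E12K-V35I-L75R-H219Q-V390D-R409K-L449F-E468K",
    "H219Q-V390D",
    "H219Q-V390D-R409K-L449F",
    "V35I-E40K-G123E-H219Q-G381S-R409K-A431V",
    "E12K-E40K-G123E-Q199H-H219Q-R409K-G412D-L449F-E468K",
    "P453L",
    "L449F",
    "L449F-Q430R-A431V",
    "A431V-I437V",
    "A431V-L449F",
]

# ASSUMPTION: approximate domain boundaries in HXB2 Gag numbering,
# supplied for illustration and not taken from the text.
GAG_DOMAINS = [
    ("MA", 1, 132), ("CA", 133, 363), ("p2", 364, 377),
    ("NC", 378, 432), ("p1", 433, 448), ("p6", 449, 500),
]

MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'L449F' into (wild type, position, mutant)."""
    match = MUTATION.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def domain_of(position):
    """Name the Gag domain containing a residue position."""
    for name, start, end in GAG_DOMAINS:
        if start <= position <= end:
            return name
    return "unknown"


def count_mutations(rows):
    """Count how many table rows contain each Gag mutation."""
    counts = Counter()
    for row in rows:
        counts.update(row.split("-"))
    return counts


counts = count_mutations(TABLE1_GAG)
for label, n_rows in counts.most_common(5):
    _, position, _ = parse_mutation(label)
    print(f"{label}\t{domain_of(position)}\t{n_rows} rows")
```

Counting rows rather than patients or isolates is a deliberate simplification. Each row of Table 1 is one reported Gag–protease combination, so the tally reflects recurrence across reports, not prevalence.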
Non-cleavage site mutations associated with PI resistance [57][51] included H219Q and R409K for Amprenavir, JE-2147, KNI-272, and UIC-94003 resistance. Gag L75R and H219Q, together with protease mutation I84V, led to Amprenavir and JE-2147 resistance. Together, these non-cleavage site mutations (synergistically with E12K, V390D, and R409K) delayed resistance to other PIs, e.g., Ritonavir and Nelfinavir [57]. Interestingly, most of these Gag non-cleavage site mutations are located in the MA–CA or p1–p6 domains. Gag MA domain mutations (e.g., R76K, Y79F, and T81A) were suggested to enhance protease accessibility to Gag cleavage sites [58][59]. Nonetheless, the exact mechanism of such non-cleavage site mutations remains elusive due to the lack of structures of full-length Gag and its sequentially cleaved subunits.
Limited structural research [34][36][53][60] has revealed an underlying allosteric mechanism in resistance development, in which Gag non-cleavage site mutations rendered the first cleavage site more flexible [53]. When coupled with protease mutations, several Gag compensatory mutations recovered protease binding affinities. Thus, the Gag and protease mutations synergistically formed a resistance network against multiple PIs [26][60][61]. Mapping these Gag–protease resistance relationships (Figure 3) onto our previously constructed PI cross-resistance network [60] showed that similar combinations of Gag mutations resist varied PIs, independently of their diverse chemical scaffolds [62].
Figure 3. A schematic map of associated Gag–Protease drug resistant mutations. Mutation hotspots are shown on both the Gag and Protease, and representatives of paired combinations of Gag and Protease mutations are shown in boxes. More details can be found in Table 1. Gag mutations are colored according to domains: MA (red), CA (green), NC (magenta), and p6 (orange).