The concept of sense and antisense (i.e., complementary) peptide interaction was developed in the early 1980s by Root-Bernstein, Biro, Blalock, Mekler, Siemion, and others [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]. First, it was theoretically assumed and later empirically observed that peptides consisting of amino acids specified by sense and antisense sequences interact with higher probability and affinity than randomly selected peptides (Table 1 and Table 2, Figure 1). This approach was successfully applied to the investigations of more than 50 ligand–acceptor (receptor) systems, including the immune response to viral subunits and related manipulations with an epitope and paratope design [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22].
Sense peptides are essential and specific parts of viral and other proteins that elicit normal and pathologic immune responses [6][7][8][9][14][19][20][21]. Using antisense peptide technology, they could be utilized to derive targeted tests for different antibody (Ab), hormone, growth factor, or cell subpopulations [4][5][6][7][8][9][13][14][17][18][19][20][21][22][23]. The potential of antisense peptides is twofold: 1. as future diagnostic tests targeting protein epitopes or paratopes of interest, or 2. as future therapeutic agents that target specific parts of antigens to selectively modify host immune response (e.g., an antisense peptide may disrupt or modify different factors like virulence, replication or host defense) [6][7][8][9][13][14][18][19][20][21]. Consequently, sense-antisense peptide interactions may serve as a useful starting point for: 1. the development of biochemical assays for the evaluation of the immune response, and 2. modeling and design of new peptide binders for specific proteins and their receptors.
Each of these specific problems is worth addressing.
2.3. Amino Acid Bias in Characterizing Ligand-Acceptor (Receptor) Interactions
The number of possible antisense peptides available for sequence selection and database screening differs greatly depending on the direction of mRNA translation. The standard genetic code table specifies the translation of antisense (or complementary) peptides in two directions.
Table 2 shows that 27 antisense amino acid pairs are derived by the 3′ → 5′ translation direction and significantly more (52 pairs) are obtained using the 5′ → 3′ direction algorithm [16][17][18][19][20]. The latter result is due to the fact that 5′ → 3′ antisense translation of the genetic code is based on 16 groups of codons, while 3′ → 5′ antisense translation depends on only four codon groups [19].
According to Siemion et al. [13][19], there are three main hypotheses concerning the interaction of sense-antisense peptides based on complementary coding principles.
The Mekler-Blalock antisense hypothesis is based on the hydropathic complementarity principle of sense and antisense peptide interactions, named Molecular Recognition Theory (MRT), which is independent of the direction of triplet reading, since the central (second) base of the coding triplet specifies the hydropathy of the amino acid [7][8][9][13][19].
According to Root-Bernstein, the antisense approach in the 3′ → 5′ direction applies to peptides of <20 amino acids that may lack specific secondary and tertiary structure [3][4][5][6][18][19][20]. Such design leads to significantly fewer antisense peptides and represents a plausible solution for the screening of bioactive ligands [3][4][5][6][18][19][20].
The Siemion hypothesis of sense-antisense peptide interaction is based on the periodicity of the genetic code, i.e., the Siemion one-step mutation ring of the code, and the resulting sense-antisense amino acid pairs are in most cases similar to the 3′ → 5′ translation direction [13][19].
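The two antisense translation directions discussed above can be illustrated with a minimal sketch. It assumes the standard genetic code, counts ordered (sense, antisense) amino acid pairs, and ignores any pair that involves a stop codon. The function names are hypothetical, and the counting convention is an assumption rather than the documented procedure behind Table 2.

```python
from itertools import product

# Standard genetic code; codons ordered by first, second, third base over U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("UCAG", "AGUC")


def antisense_codon(codon, direction):
    """Return the complementary codon read in the '3to5' or '5to3' direction."""
    complement = codon.translate(COMPLEMENT)
    if direction == "3to5":
        return complement        # complementary bases read in the same order
    if direction == "5to3":
        return complement[::-1]  # reverse complement
    raise ValueError("direction must be '3to5' or '5to3'")


def antisense_pairs(direction):
    """Collect ordered (sense, antisense) amino acid pairs, skipping stop codons."""
    pairs = set()
    for codon, sense_aa in CODE.items():
        anti_aa = CODE[antisense_codon(codon, direction)]
        if sense_aa != "*" and anti_aa != "*":
            pairs.add((sense_aa, anti_aa))
    return pairs


for direction in ("3to5", "5to3"):
    print(direction, len(antisense_pairs(direction)))
```

Under these assumptions the sketch prints 27 pairs for the 3′ → 5′ direction and 52 for the 5′ → 3′ direction, in line with the counts quoted from Table 2. For example, methionine (AUG) pairs with tyrosine (UAC) in the 3′ → 5′ direction and with histidine (CAU) in the 5′ → 3′ direction.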
The clustering of amino acid pairs according to interaction preference is defined by the complementary U ↔ A and C ↔ G bases at the second codon position. The second codon base, according to Woese, specifies the physicochemical properties of the amino acids [24][34]. Therefore, it is not surprising that diverse amino acid properties, such as hydrophobicity, hydrophilicity, lipophilicity, and the molecular descriptors of contact potential (Miyazawa-Jernigan), hydrophobic moment, and intrinsic disorder, follow the identical sense and antisense complementarity clustering scheme that is associated with molecular interaction at the peptide level (≥4 aa) [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24]. In a recent article, Štambuk et al. [19] emphasized that “the natural genetic coding algorithm for sense and antisense peptide interactions combines elements of amino acid physico-chemical properties, stereochemical interactions, and bidirectional transcription”. The relationship of the genetic code and amino acid polarity to protein structure and temperature conditions is discussed in reference [24] and the related Data in Brief article.
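The role of the second codon base can be made concrete with a short sketch. It assumes the standard genetic code and groups amino acids by the middle base of their codons; it also checks that the middle base of an antisense codon is the same whichever direction the complementary strand is read. The helper names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code; codons ordered by first, second, third base over U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = {"U": "A", "A": "U", "C": "G", "G": "C"}


def second_base_groups():
    """Group amino acids (stop codons excluded) by the second base of their codons."""
    groups = defaultdict(set)
    for codon, aa in CODE.items():
        if aa != "*":
            groups[codon[1]].add(aa)
    return dict(groups)


def middle_base_is_direction_independent():
    """Check that 3'->5' and 5'->3' antisense codons share the same middle base."""
    for codon in CODE:
        complement = "".join(COMPLEMENT[b] for b in codon)
        reverse_complement = complement[::-1]
        if not (complement[1] == reverse_complement[1] == COMPLEMENT[codon[1]]):
            return False
    return True


groups = second_base_groups()
for base in BASES:
    print(base, "".join(sorted(groups[base])))
print(middle_base_is_direction_independent())
```

With U in the second position the code specifies F, I, L, M, and V, while A in the second position specifies D, E, H, K, N, Q, and Y, so a U ↔ A exchange at the central base pairs the hydrophobic group with the hydrophilic one regardless of reading direction.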
2.5. Amino Acid Coding, Complementarity, and Frameshifts
Amino acid coding with respect to complementary protein constructs, mutation, and frameshifts has been studied by many authors, including Arques and Michel [41][42], Bartonek et al. [43], McGuire and Holmes [21], Štambuk [44][45], Wichmann et al. [46][47], and Youvan et al. [48][49][50].
A recent article by Bartonek et al. [43] showed that a frameshifting mechanism could be an effective evolutionary strategy for generating novel proteins with mostly unchanged physicochemical properties. Nevertheless, an important aspect of frameshift coding related to antisense/complementary sequences needs to be addressed. In 1996, Arques and Michel [41][42] identified a complementary circular code of trinucleotides (X) which on average has the highest occurrence in the reading frame (X₀) compared to the two shifted frames (X₁ and X₂).
This code was found in the protein-coding genes of bacteria, archaea, eukaryotes, plasmids, and viruses [42][51]. It enables reading frames to be retrieved in genes without start codons, using a window length of ≥13 nucleotides [41][42]. Frame X₀ codes for 12 amino acids (A, N, D, Q, E, G, I, L, F, T, Y, V), while frames X₁ (A, R, C, I, L, K, M, P, S, T, V) and X₂ (A, R, C, Q, G, H, L, P, S, W, Y) code for 11 amino acids each [41][42][51]. With respect to the antisense codon and amino acid translation in the 5′ → 3′ direction, the X₀ frame of the circular code is self-complementary, and the X₁ and X₂ frames are mutually complementary [41][42]. In 1999, Štambuk showed that the combinatorial necklace model enables the use of coding theory arithmetic in the analyses of the genetic code and circular code antisense translations [24][44][45][52].
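The frame properties described above can be checked with a brief sketch. The 20 trinucleotides listed for X₀ are not given in this text; they are the set usually quoted for this circular code and should be verified against [41][42]. The sketch derives X₁ and X₂ by cyclic permutation of each trinucleotide, which is an assumption about how the shifted frames are defined, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the DNA alphabet; codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumed X0 trinucleotide set (to be checked against the cited sources).
X0 = set(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def rotate(codon, k):
    """Cyclically shift a trinucleotide left by k positions."""
    return codon[k:] + codon[:k]


def reverse_complement(codon):
    """Return the 5'->3' antisense trinucleotide."""
    return codon.translate(COMPLEMENT)[::-1]


def amino_acids(codons):
    """Return the set of amino acids coded by the trinucleotides, without stops."""
    return {CODE[c] for c in codons} - {"*"}


X1 = {rotate(c, 1) for c in X0}
X2 = {rotate(c, 2) for c in X0}

for name, frame in (("X0", X0), ("X1", X1), ("X2", X2)):
    print(name, len(amino_acids(frame)), "".join(sorted(amino_acids(frame))))
print("X0 self-complementary:", {reverse_complement(c) for c in X0} == X0)
print("X1 and X2 mutually complementary:", {reverse_complement(c) for c in X1} == X2)
```

With this trinucleotide set the sketch reproduces the 12, 11, and 11 amino acids listed for X₀, X₁, and X₂, and both complementarity checks evaluate to true.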
Two seemingly opposite biological coding rules are characteristic for the interpretation of the SGC frameshifts and related mathematics—including complementary transformations within frames. They both deal with the mechanisms of translation error-control and flexibility and could have an important impact on SGC repertoire manipulations.
The first coding rule is that the X₀, X₁, and X₂ frames of the circular code distinguish the three possible reading frames of a protein-coding sequence, since hidden stop codons in X₁ and X₂ prevent off-X₀-frame protein translation. This mechanism is often called the ambush hypothesis [53][54], and it is thought to ensure accurate translation.
Paradoxically, the second coding rule, related to SGC flexibility, is that stop codon readthrough may be promoted by the nucleotide environment, with glutamine (Q), tyrosine (Y), and lysine (K) inserted at UAA and UAG stop codons, whereas tryptophan (W), cysteine (C), and arginine (R) could be inserted at a UGA stop codon [55][56].
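The hidden stop codons invoked by the first coding rule can be located with a few lines of code. The sketch below scans the three reading frames of a DNA sequence and records where stop codons occur; the function name and the nine-nucleotide example sequence are hypothetical and serve only as an illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stops_by_frame(sequence):
    """Map each reading frame (0, 1, 2) to the start positions of its stop codons."""
    found = {0: [], 1: [], 2: []}
    for frame in range(3):
        # Step through complete codons only, starting at the frame offset.
        for start in range(frame, len(sequence) - 2, 3):
            if sequence[start:start + 3] in STOP_CODONS:
                found[frame].append(start)
    return found


# Hypothetical example: GTA GAT AAT is free of stop codons in frame 0.
example = "GTAGATAAT"
print(stops_by_frame(example))  # {0: [], 1: [1], 2: [5]}
```

In this example the in-frame reading contains no stop codon, while a +1 shift meets TAG at position 1 and a +2 shift meets TAA at position 5, so translation in either shifted frame would terminate early.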
Considering bioengineering modeling, the reduced number of amino acids in frames X₀, X₁, and X₂ matches the criteria for the use of simplified amino acid alphabets for engineering purposes and related sample space reductions [57]. Consequently, we measured the relationships of the main amino acid (aa) properties addressed by Bartonek et al. [43] in frames X₀, X₁, and X₂ of the complementary circular code [41][42]. The factors of amino acid polarity, secondary structure, molecular volume, diversity, and electrostatic charge defined by Atchley et al. [58] were correlated with the scales of nucleobase/amino acid interaction preferences for guanine (GUA), purines (PUR), and pyrimidines (PYR) [43][59].
A significant rise in the correlation of amino acid polarity with the preference scales for guanine (GUA), purines (PUR), and pyrimidines (PYR) was observed in frame X₀ (Table 3). In frame X₁ (shifts +1 and −2), we found a strong correlation between the codon and amino acid diversity factor and the GUA, PUR, and PYR scales (Table 3). This observation is not surprising, since Atchley et al. [58] reported that the diversity factor exhibits a highly significant correlation with amino acid physicochemical attributes and substitution matrices, and the X₁ frame is specified by the second codon base, which is associated with the majority of such information [24][34][60].
Table 3. Correlations between amino acid factors and preference scales in frames 0 and +1 (−2).
| Parameter | Polarity (20 aa) | Polarity (X₀, 12 aa) | Diversity (X₁, 11 aa) |
| --- | --- | --- | --- |
| GUA nucleobase preference | −0.54 * | −0.63 * | 0.71 * |
| PUR nucleobase preference | −0.07 | −0.49 * | 0.82 * |
| PYR nucleobase preference | 0.06 | 0.49 * | −0.85 * |
* p < 0.05 (Pearson’s R); aa = amino acid.
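A correlation of the kind reported in Table 3 can be sketched as Pearson's R computed over only the amino acids present in a given frame. The example below is a minimal illustration: the function names are hypothetical, and the numeric values are toy numbers chosen for clarity, not the Atchley factors or the nucleobase preference scales used in the analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def frame_correlation(factor, scale, amino_acids):
    """Correlate two per-amino-acid scales over the amino acids of one frame."""
    selected = sorted(amino_acids)
    return pearson_r([factor[a] for a in selected], [scale[a] for a in selected])


# Toy values only (hypothetical), restricted to four amino acids.
toy_factor = {"A": 1.0, "N": 2.0, "D": 3.0, "Q": 4.0}
toy_scale = {"A": 8.0, "N": 6.0, "D": 4.0, "Q": 2.0}
print(frame_correlation(toy_factor, toy_scale, "ANDQ"))
```

Restricting the amino acid set passed to `frame_correlation` to the 12 amino acids of X₀ or the 11 amino acids of X₁ would mirror the frame-specific columns of Table 3, provided the real factor and preference values are supplied.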
However, in frame X₂ (shifts +2 and −1), correlations between physicochemical factors and nucleobase preference scales were not significant. This observation is in agreement with recent findings that, contrary to X₁, the frame X₂ of the complementary circular code is less optimized than the SGC to reduce the effects of +2 and −1 frameshifts, in particular with respect to the physicochemical properties of amino acids [51].
A rise in correlation between amino acid factors and nucleobase preference scales in frames X₀ and X₁ of the circular code may reflect the importance of the first two bases for the variables encoding scheme [24][34][60], and points to a possible application of the GUA, PUR, and PYR scales [43][59] to different genetic code analyses. In our opinion, comparative investigations of the complementary circular code and the SGC concerning frameshifts, error correction, evolution, and biological engineering seem to be justified.
As emphasized by Choi et al. [61], “ribosome is intrinsically susceptible to frameshift before its translocation and this transient state is prolonged by the presence of a precisely positioned downstream mRNA structure”. Additionally, according to Rozov et al. [62], the ribosome also “prohibits the G-U wobble geometry at the first position of the codon–anticodon helix”. Therefore, it is not surprising that programmed ribosomal frameshifting enables reverse-genetics approaches and the construction of modified viruses with engineered deletions and/or foreign inserts [63].
Such engineering procedures could be used: 1. for artificial control of gene expression at the translation level, and 2. to generate differentiable marker vaccines and modified live virus vaccines [61][63]. More details on the challenges and perspectives of reverse vaccinology (RV) approaches may be found in Van Regenmortel [64] and Moxon, Reche, and Rappuoli [65].
3. Perspective
The applicability of antisense peptide technology (APT) was confirmed recently for the magnetic particle enzyme immunoassay (MPEIA, Figure 2b) and immunohistochemical procedures [19][20]. This opens a perspective for the development of a new class of efficient immunochemical assays based on short peptide technology [18][19][20]. It was also shown that modern computational methods enable a new approach to the studies of sense and antisense peptide interactions [20]. Several free web-based services for protein structure prediction and modeling (e.g., I-TASSER, Phyre2, PEP-FOLD 3, CABS-dock) enable accurate protein-peptide docking, i.e., in silico search for the peptide binding sites [20][28].
Small molecules and peptides may also be used for blocking protein-protein and protein-peptide interactions. In addition to NMR and X-ray crystallographic methods and mutational data, computational and virtual spectroscopy methods, such as the informational spectrum method (ISM), could also be used to define hot spots in proteins [18][20]. An APT-based approach is also useful for peptide interaction and pharmacophore modeling [32][35]. The application of artificial proteins in the context of APT is also a plausible method to derive new antisense modulators of protein interactions [19][24][66][67].
APT could be easily adapted to magnetic and polystyrene bead assays, conventional ELISAs, and multiplex assays, so it is possible to achieve two major lines of quick and sensitive assay development: 1. MPEIAs read with appropriate absorbance readers, and 2. multiplex ELISAs read with appropriate imagers (e.g., with high-resolution chemiluminescence readers for printed microtiter plates) [19][20].
Developing new immunoassays is important in situations such as infection outbreaks, because quick, inexpensive, and simple assays can be designed in a relatively short time and automated for medium/high-throughput screening of particular binders, peptide motifs, antibodies, etc. If carefully selected, such laboratory techniques may be used, depending on the experimental design, for:
- selection of different targets and evaluation of complementary (sense–antisense) peptide binding;
- quantification of specific antibodies, peptides, and proteins;
- design of MPEIAs and multiplex ELISAs tailored for a specific purpose.
The benefits of APT outweigh the costs of medium/high-throughput screening and random peptide libraries and could lead to considerable savings in time and money. Practical applications and benefits of APT are:
- Quick design and validation of the complementary ligands and acceptors;
- Computational validation and virtual screening of different protein and peptide structures;
- Rationalization of peptide library screening;
- The tests can be produced in a short period of time;
- The tests will be made composite (according to the LEGO principle) and will consist of less expensive and commercially available components;
- The time required to obtain results is shorter (since no antibody production is needed);
- The test enables the testing of large numbers of samples using standard laboratory equipment (since it does not require special reagents or complicated sample processing);
- The tests are likely to prove important for the investigation of the immune response, disease pathogenesis, and clinical outcome of different infections;
- Designed antisense peptides (and anti-antisenses [21]) may also provide a basis for further development of vaccines and lead compounds for different diseases;
- Detection of mutant strains is quicker since new antisense peptide motifs could be synthesized, evaluated for binding, and easily linked to magnetic particles in a short period of time, which avoids the antibody production process;
- A green chemistry approach significantly reduces or avoids the loss of animal life.