1. Types of Programs for Prediction and Development of New AMPs
Over the past decades, great efforts have been made to discover AMPs in natural sources using wet-lab techniques. Classical chromatographic methods for the detection of natural AMPs have played an important role in shaping the modern view of AMPs in terms of biological source, amino acid sequence, three-dimensional structure, and antimicrobial activity. Since the 1990s, there have been more and more possibilities for predicting and designing AMPs with a specific function, pharmacological target, and activity. The first report of such a discovery concerned human cathelicidin; this peptide was detected by aligning the conserved sequence of the pro-peptide domain
[1]. In 2004, Yount and Yeaman discovered a motif for identifying new AMPs
[2]. In addition, more efficient genomic and proteomic methods for identifying peptides with antimicrobial function have been developed in recent years
[3][4]. Applying prediction methods to genomes and proteomes, followed by experimental verification, accelerates the discovery of peptides, which in turn expands our understanding of the functional nature of these compounds.
The similarity of peptides with and without antimicrobial activity is a bottleneck for bioinformatics methods. To solve this problem, programs for predicting antimicrobial activity based on various mathematical algorithms and information sources are used.
The programs of the first category use only data on the primary structure of mature AMPs. Based on these data, the G-X-C motif was discovered in peptides containing disulfide bonds (defensins)
[5]. Using this motif, Yount and Yeaman showed that the peptides brazzein and charybdotoxin, not previously known to be antimicrobial, are active against bacteria and the fungus Candida albicans [2]. A total of 28 new human defensins were identified. Programs in this category determine the length of the peptide, its charge, the content of hydrophobic residues in the molecule, and its amino acid composition. Predictions are made according to how well these peptide parameters fall within the range of parameters from the database.
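The parameter check described above can be illustrated with a minimal Python sketch. It is not the implementation of any specific program: the function names, the set of residues counted as hydrophobic, and the acceptance ranges are all assumptions chosen for illustration, and a real tool would derive its ranges from a curated AMP database.

```python
from collections import Counter

# Residues counted as hydrophobic (an assumption; groupings differ between tools).
HYDROPHOBIC = set("AILMFVW")

# Hypothetical acceptance ranges; a real program would derive them from a database.
RANGES = {
    "length": (10, 50),
    "net_charge": (1, 10),
    "hydrophobic_fraction": (0.3, 0.7),
}

def peptide_features(seq):
    """Compute length, approximate net charge, hydrophobic fraction and composition."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    # Crude net charge: +1 for Lys/Arg, -1 for Asp/Glu; His and termini are ignored.
    net_charge = counts["K"] + counts["R"] - counts["D"] - counts["E"]
    hydrophobic = sum(counts[r] for r in HYDROPHOBIC)
    return {
        "length": len(seq),
        "net_charge": net_charge,
        "hydrophobic_fraction": hydrophobic / len(seq),
        "composition": {r: n / len(seq) for r, n in counts.items()},
    }

def within_ranges(features, ranges=RANGES):
    """Return True if every ranged feature falls inside its acceptance interval."""
    return all(low <= features[key] <= high for key, (low, high) in ranges.items())

# Example with a short cationic sequence.
features = peptide_features("GIGKFLHSAKKFGKAFVGEIMNS")
print(features["length"], features["net_charge"], within_ranges(features))
```

A range check of this kind is only a coarse filter; it says nothing about structure or mechanism, which is why the other categories of programs use additional sources of information.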
The programs of the second category use highly conserved sequences of amino acid residues of pro-peptides. The amino acid sequences of mature peptides are more variable than the sequences of pro-peptides. This observation underlies the strategy of searching for such peptides
[6]. Since the discovery of the first (antimicrobial) peptide, cathelicidin, in 1988, more than 100 such peptides have been identified. Analysis of processing signals also allows the antimicrobial activity of other classes of AMPs to be predicted. For example, amphibian antimicrobial peptide precursors also have a common and highly conserved pro-region, usually terminating in a typical Lys-Arg processing signal. This discovery was used as the basis for identifying many amphibian peptides
[4][7].
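The idea of using the Lys-Arg processing signal can be sketched in a few lines of Python. This is an illustrative simplification, not a published algorithm: the function name, the choice of the last "KR" occurrence as the cleavage point, the minimum length of the mature peptide, and the toy precursor sequence are all assumptions.

```python
def split_at_processing_signal(precursor, signal="KR", min_mature_len=8):
    """Split a precursor after the last occurrence of the processing signal.

    Returns (pro_region_including_signal, mature_candidate), or None when the
    signal is absent or the downstream fragment is shorter than min_mature_len.
    Taking the LAST occurrence is an assumption; it fails if the mature
    peptide itself contains the signal.
    """
    precursor = precursor.upper()
    idx = precursor.rfind(signal)
    if idx == -1:
        return None
    cut = idx + len(signal)
    mature = precursor[cut:]
    if len(mature) < min_mature_len:
        return None
    return precursor[:cut], mature

# Toy precursor (hypothetical sequence, not taken from any database).
print(split_at_processing_signal("MKTLLLAVEEEKRGLFDIVKKVVGALG"))
```

In practice such a scan would be combined with a check that the upstream region resembles the conserved pro-region, since a Lys-Arg pair on its own occurs in many unrelated proteins.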
A combination of approaches from previous types of programs is used in the third category. Fjell et al. created a program for prediction based on both pro-peptide sequences and mature peptides
[8]. Peptides from the Antimicrobial Sequences Database (AMSDb) are classified into several clusters, after which prediction parameters are calculated
[9]. The authors identified 146 clusters for mature peptides and 40 clusters for pro-peptides. Using this program, it was possible to achieve 99% prediction accuracy. However, it is unclear how many more clusters can be found based on the most recent collection of naturally occurring AMPs
[10], and whether it is possible to make the prediction even better.
The fourth type of program is based on the search for homologous sequences of enzymes that carry out the transport or modification of AMPs. LanM is a group of proteins that modify peptides from the lantibiotic family. The search for homologs of this group of proteins led to the discovery of new peptides in this family, haloduracin and lichenicidin
[11]. Using this approach, 89 LanM homologues were identified; some of them were found in 61 strains that do not produce lantibiotics. The conserved LanT transporter was used to detect other lantibiotics. Based on the similarity with this protein, Singh and Sareen identified 54 bacterial strains containing LanT homologues
[12]. Morton et al. used a protein cluster required for modification, transport, and resistance to bacteriocins for prediction
[13]. Using this program, the authors predicted bacteriocin gene blocks for 2773 genomes.
The last type of program uses genetic information about expression, processing, and transport, and also performs comparisons with the already described AMPs. The processing of eukaryotic transcription data using data mining tools has allowed the identification of new AMPs. Lynn et al. identified 9 novel chicken AMPs by searching for homology tags of clustered expressed sequences using BLAST and a hidden Markov model
[14]. Amaral et al. identified about a hundred AMPs based on physicochemical characteristics: peptide length, total charge surface and hydrophobic moment
[15]. Proteolytic cleavage of proteins can release AMP. Based on this fact, Torrent et al. developed a theoretical method for identifying potentially active regions with antimicrobial activity
[16]. Hellinger et al. achieved significant success by combining transcriptomic and proteomic data
[17]. They were able to identify 164 AMPs from a single plant: 108 based on transcriptomic data only and 127 based on mass spectrometry data. Perhaps one of the most effective strategies for finding natural AMP candidates is based on in silico analysis of RNAs synthesized after immunization of host cells with bacteria
[18].
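Of the physicochemical characteristics mentioned above, the hydrophobic moment is the least self-explanatory. The sketch below computes it using the widely used Eisenberg definition: a vector sum of residue hydrophobicities, with successive residues rotated by 100° for an ideal α-helix. The text does not state which definition or hydrophobicity scale the cited authors used, so both the formula and the scale values (the Eisenberg consensus scale as commonly tabulated) are assumptions here.

```python
import math

# Eisenberg consensus hydrophobicity scale (values as commonly tabulated;
# verify against the original publication before relying on them).
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}

def hydrophobic_moment(seq, angle_deg=100.0, scale=EISENBERG):
    """Mean hydrophobic moment per residue for a given angular periodicity.

    angle_deg=100 corresponds to an ideal alpha-helix (3.6 residues per turn).
    Raises KeyError for residues missing from the scale.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)

# A uniform sequence spanning whole helical turns has essentially no moment,
# whereas a sequence alternating polar and apolar faces has a clearly nonzero one.
print(hydrophobic_moment("A" * 18))
print(hydrophobic_moment("KLLKLLLKLLKLLLKLLK"))
```

A high hydrophobic moment indicates that hydrophobic and polar residues segregate onto opposite faces of the helix, which is the amphipathic arrangement typical of many membrane-active AMPs.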
It is important to note that in order to predict AMPs that will act by the mechanism of coaggregation with the target protein, it is first of all necessary to search for amyloidogenic regions in this protein
[19][20]. Approaches that include algorithms to predict aggregation/amyloidogenic sites may be useful for this purpose
[21][22][23][24]. In addition, information on specific folded/unfolded or rigidity/flexibility regions of the protein may be appropriate for the development of AMP based on the sequence of the target protein
[25][26].
Figure 1. Schematic representation of the main aspects associated with the development of new antibacterial peptides.
It is important to take into account AMP development programs that are used to predict a possible immune response and allergic reactions to a specific peptide structure
[27]. This validation is especially important for drug development based on AMPs, in particular, antiviral drugs, which will be discussed in more detail in the final section of this review.
2. SARS-CoV-2 as a Target for Prediction and Development of New AMPs
Coronaviruses (CoVs) are a diverse group of viruses that infect a variety of animals, including livestock and poultry, and can cause mild to severe respiratory infections in humans. Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a novel, highly transmissible, pathogenic coronavirus that emerged at the end of 2019 and triggered a pandemic of a new acute respiratory disease, known as coronavirus disease 2019 (COVID-19). Previously, Severe Acute Respiratory Syndrome Coronavirus 1 (SARS-CoV-1) and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) caused outbreaks of unusual viral pneumonia in 2002 and 2012, respectively
[28]. 216 countries and regions from all six continents have reported more than 65 million COVID-19 cases, and more than 1.5 million patients have died, according to data from
www.worldometer.info on December 4, 2020. The spread of SARS-CoV-2 is currently a serious threat to the health and life of people around the world. The efforts of scientists and specialists are aimed at developing effective methods of treating coronavirus infection. Clinical trials of various treatments for SARS-CoV-2 are currently underway, but none have been approved yet. Thus, SARS-CoV-2 is a relevant object for demonstrating the features of a strategy for the development of AMPs against a specific virus. It is well known that some peptides have antimicrobial activity against viruses. For example, melittin has significant antiviral activity and reduces the infectivity of enterovirus, human immunodeficiency virus (HIV), influenza A viruses, vesicular stomatitis virus (VSV), and some other viruses
[29]. Likewise, AMPs derived from cathelicidin effectively suppress Ebola virus infection
[30].
The genome of SARS-CoV-2 is around 30 kb, and is a positive-sense single-stranded RNA (+ssRNA) that contains six functional open reading frames (ORFs): spike (S), nucleocapsid (N), membrane (M), envelope (E), and replicase (ORF1a/ORF1b)
[31]. The SARS-CoV-2 S protein consists of two subunits. The S1 subunit is important for binding to the host cell angiotensin-converting enzyme 2 (ACE2) receptor and the S2 subunit is responsible for the fusion of the virus and the cell membrane
[32]. The S1 region is further divided into an N-terminal domain (NTD) and three C-terminal domains (CTDs)
[33]. CTDs contain a region (amino acids 319–529) of the receptor-binding domain (RBD), which plays a key role in contact with the hACE2 receptor
[34]. Fourteen RBD residues bind to the hACE2 receptor. Of these, Tyr449, Tyr453, Asn487, Tyr489, Gly496, Thr500, Gly502, and Tyr505 are conserved in the SARS-CoV-2 RBD, while Leu455, Phe456, Phe486, Gln493, Gln498, and Asn501 differ between the RBDs of SARS-CoV-1 and SARS-CoV-2
[35]. Biochemical analysis and a pseudovirus entry assay confirmed that the structural features of the SARS-CoV-2 RBD give it a higher hACE2 binding affinity than that of SARS-CoV RBDs
[36]. The S2 subunit contains regions required for membrane fusion, including two heptad repeats (HR), a transmembrane domain (TM), an internal membrane fusion peptide (FP), and a membrane-proximal external region (MPER)
[37].
Infection mechanisms have been investigated and it has been highlighted that the SARS-CoV-2 viral cell-surface spike protein targets hACE2 receptors. Cannalire et al.
[38] noted that the SBP1 peptide inhibits the S/ACE2 interaction by targeting the RBD region of the S protein, but the RBD region is prone to mutation, which makes it difficult to develop broad-spectrum inhibitors. In this case, the preferred approach is fusion inhibitors targeting the more stable heptad repeats (HRs) involved in the membrane fusion process. Among the peptides described, the EK1C4 lipopeptide strongly inhibits S-mediated fusion and shows selective, broad antiviral activity in cellular assays against SARS-CoV-2 and other relevant CoVs such as MERS-CoV
[38]. Another study used a computational approach to identify, on the basis of the hACE2 sequence, ten peptides with a high potential for interaction with CoV-RBD
[39]. Drugs targeting this mechanism are not yet available, but recent work has demonstrated that peptide hACE2 mimics, created from the H1 helix and consisting only of natural amino acids, block infection of human lung cells with IC50 values in the nM range
[40]. The goal is to eliminate the effect on the renin–angiotensin system and the kinin–kallikrein system in light of the pathophysiological mechanisms associated with SARS-CoV-2
[41][42].
Similarly to the fusion proteins of other coronaviruses, SARS-CoV-2 S is activated by cellular proteases. It is assumed that the S1 and S2 subunits remain non-covalently linked after cleavage
[43]. It is known that the furin protease of the host cell cleaves the SARS-CoV-2 S protein at the S1/S2 site. Cleavage at the S1/S2 site (residues 669−688) and subsequent cleavage at the S2′ site (residues 808−820) by the TMPRSS2 protease is important for virus penetration into lung cells
[44][45]. Tavassoly et al. presented an interesting view that a peptide that is cut between two sites (S1/S2 and S2′) is detached from the virus and enters the intra- or extracellular environment. Subsequently, this peptide can induce some immunological reactions and act as a functional amyloid. The latter hypothesis was supported by the data of the AGGRESCAN program, which identified sites in the S-CoV peptide responsible for the self-aggregation properties
[46]. Because the SARS-CoV-2 spike protein is critical for penetration into the host cell, it could be an interesting target protein option for the development of antiviral peptides. The identification of regions of a protein molecule prone to aggregation/formation of amyloid fibrils is an important step for the development of AMPs acting on the basis of a coaggregation mechanism. For this reason, we used several bioinformatics tools (FoldAmyloid, PASTA 2.0, Waltz and AGGRESCAN), which were previously used
[19][47] to predict amyloidogenic sequences in proteins. The amyloidogenic regions predicted for the SARS-CoV-2 S protein by these methods were then compared.
In these predictions, the amyloidogenic regions identified by the individual programs agree to within one amino acid residue, but no prediction is completely identical across FoldAmyloid, PASTA 2.0, Waltz, and AGGRESCAN. Notably, all four programs predict an amyloidogenic region of about 30 amino acid residues at the C-terminus of the SARS-CoV-2 spike protein. This region, however, corresponds to the transmembrane domain. Interestingly, for the envelope (E) protein of SARS-CoV-1, amyloidogenic sequences at the C-terminus were also previously noted
[48]. The same work discusses the possible role of the amyloidogenic sequence in the functions of protein E, namely binding to host cell membranes and forming ion channels. In our opinion, the amyloidogenic region in the middle of the SARS-CoV-2 S protein can be the basis for the development of AMPs acting by the mechanism of directed coaggregation
[49]. We hypothesize that anti-CoV peptides can bind to the RBD of the spike S1 subunit (Figure 2). This interaction should reduce the binding affinity of S1 for the hACE2 receptor of the target cell. A feature of this mechanism is that viral particles may “stick together”, which can prevent the spread of the virus and reduce the viral load.
Figure 2. Hypothetical mechanism of direct coaggregation against CoVs.
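The comparison of predictions from FoldAmyloid, PASTA 2.0, Waltz, and AGGRESCAN described above amounts to finding the sequence positions on which several predictors agree. The following Python sketch shows one simple way to do this by position-wise voting. It is an assumption about how such a consensus could be computed, not the procedure used by the authors, and the residue ranges in the example are hypothetical placeholders rather than actual predictions for the S protein.

```python
def consensus_regions(predictions, min_tools, min_len=1):
    """Merge per-tool predictions into regions supported by at least min_tools.

    predictions: dict mapping a tool name to a list of (start, end) residue
    ranges, 1-based and inclusive. Returns a sorted list of (start, end) tuples.
    """
    votes = {}
    for intervals in predictions.values():
        covered = set()
        for start, end in intervals:
            covered.update(range(start, end + 1))
        # Each tool votes at most once per position, even if its ranges overlap.
        for pos in covered:
            votes[pos] = votes.get(pos, 0) + 1

    supported = sorted(pos for pos, n in votes.items() if n >= min_tools)
    regions = []
    for pos in supported:
        if regions and pos == regions[-1][1] + 1:
            regions[-1][1] = pos          # extend the current region
        else:
            regions.append([pos, pos])    # start a new region
    return [(s, e) for s, e in regions if e - s + 1 >= min_len]

# Hypothetical residue ranges, for illustration only.
example = {
    "FoldAmyloid": [(10, 20), (50, 60)],
    "PASTA 2.0": [(12, 22)],
    "Waltz": [(11, 19), (55, 58)],
    "AGGRESCAN": [(13, 25), (52, 57)],
}
print(consensus_regions(example, min_tools=4))
print(consensus_regions(example, min_tools=3))
```

Requiring agreement of all tools gives a conservative set of regions, while lowering `min_tools` trades specificity for coverage; which threshold is appropriate depends on how the candidate regions will be validated experimentally.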
Based on a similar logic, a number of authors tried to synthesize peptide inhibitors of the Spike–hACE2 interaction against SARS-CoV-1 and SARS-CoV-2 viruses
[50][51][52][53][54]. As mentioned above, therapeutic peptides have several disadvantages: low bioavailability, short half-life, and toxicity. Lipidation, PEGylation, and glycosylation can therefore be useful for improving the pharmacological properties of peptides. In addition, glycosylation can facilitate the specific recognition of binding sites on viral particles, much as C-type lectins recognize pathogen patterns
[55][56][57].
Recently, promising technologies for creating a vaccine against COVID-19 have been described. In these works, it is proposed to use antigenic peptides developed on the basis of peptide epitopes of viral proteins to inhibit SARS-CoV-2 infection and based on peptide epitopes of T and B lymphocytes to stimulate the human immune response
[58]. High-performance in silico technologies now allow the development of new approaches to the creation of peptide vaccines. Antigenic epitopes found in viral proteins can simultaneously meet several criteria, including immunodominance, non-allergenicity, population coverage, and lack of variability (conservation), for efficient binding and molecular interaction with HLA (human leukocyte antigen) and TLR (toll-like receptor) alleles
[27]. To evaluate the effectiveness of peptide vaccines against SARS-CoV-2, the frequencies of HLA haplotypes can be used to predict the coverage of the developed vaccine. Peptides are evaluated based on the frequency of HLA haplotypes or HLA alleles in the target population, and peptides with undesirable properties, such as those expected to be glycosylated, identical to peptides in the human proteome, or rapidly degraded, are filtered out
[59]. Using in silico methods, B-cell, T-cell, and IFN-gamma epitopes present on four structural proteins of the virus were mapped, and a multi-epitope peptide-based vaccine was developed
[60]. When developing a peptide vaccine, it should be taken into account that some peptides can provoke increased production of interleukin 6 (IL-6). In general, the increased mortality of patients with COVID-19 is associated with the induction of a cytokine storm. One study predicted 222 peptide sequences based on the viral spike protein that can cause increased production of IL-6
[61]. The peptide inhibitor of protein kinase CK2, previously studied in cancer, can also be used in the treatment of COVID-19. In a small (20-patient) randomized controlled clinical trial, the CIGB-325 peptide reduced the mean number of lung lesions. However, in some patients, CIGB-325 may cause itching, redness, and rashes. In the future, it is planned to test the CIGB-325 peptide on more subjects in various combinations with antiviral drugs
[62]. Preliminary encouraging results were obtained with the intravenous administration of the CIGB-258 immunomodulatory peptide to critically or severely ill COVID-19 patients. CIGB-258 significantly reduced the levels of biomarkers associated with hyperinflammation, IL-6 and tumor necrosis factor (TNFα) during treatment
[63]. Another work suggests the use of annexin A1 Ac2-26 mimetic peptide, which reduces IL-6 production, pain and exudate, which could be a promising treatment in the fight against COVID-19
[64]. In addition, it is assumed that some short peptides that have already been developed and tested in the treatment of other diseases can be used as inhibitors of the SARS-CoV-2 virus, as well as immunomodulators and bronchoprotectors in the pathological process of COVID-19
[65].
Drug repurposing methods for the treatment of SARS-CoV-2 infection are currently being investigated. In one study, 300 peptide-like structures were selected from various databases. Using molecular dynamics modeling and docking analysis, four peptide-like structures demonstrated strong binding affinity for amino acid residues within the binding site of the SARS-CoV-2 main protease (Mpro, also called 3CLpro) [66][67]. The evolutionary aspect of SARS-CoV-2 should be considered in order to develop peptides that can continue to target the virus. Thus, in order to select four peptides with strong binding affinity for the SARS-CoV-2 main protease, 2765 sequences containing a wide range of mutants from patients with COVID-19 were analyzed by molecular dynamics methods
[68]. A computational approach was used to select 15 peptides that showed a higher affinity for RBD of SARS-CoV-2 S protein compared to the α-helix of the hACE2 receptor. It was noted that, in all the detected stable peptide-protein complexes, the Tyr489 and Tyr505 residues in RBD are involved in interaction, which suggests that they are critical for the binding of the discovered antiviral peptides to RBD
[69]. After modeling based on the key interacting motifs of the spike protein, four new synthetic SARS-BLOCK™ peptides have been developed and characterized, which can serve as a combined therapeutic, immune and prophylactic agent against SARS-CoV-2. The technology may be interesting if it is possible to show the low cytotoxicity of the developed peptides
[70].
Since the COVID-19 pandemic was announced, various options have been proposed to combat the infection. Recently, reviews have appeared that draw attention to the prospects for the development and use of antiviral peptides in the therapy of SARS-CoV-2, SARS-CoV-1, MERS-CoV and other respiratory viruses
[71]. In particular, there was a point of view that synthetic or natural AMPs can be used to reduce the viral load on cells by blocking the contacts of the virus with receptors on the cell surface
[72]. The use of protein/peptide inhalers, due to their prolonged action, higher efficacy, lower systemic availability and minimal toxicity, may be an effective approach for the treatment of SARS-CoV-2
[73]. Modification of existing natural AMPs is an attractive approach, since it can lead to the creation of new antiviral therapeutic agents
[74]. One study shows the promise of nisin, a dietary AMP produced by lactic acid bacteria, which can bind to hACE2 and thereby competitively inhibit binding of the SARS-CoV-2 RBD
[75]. Against COVID-19, it is proposed to use an AMP such as lactoferrin (LF), which is an iron-binding glycoprotein. However, it should be noted that the antiviral mechanisms of LF differ from virus to virus. Lactoferrin can enhance the host’s immunity against viral infection or bind directly to the viral particle or receptors and heparan sulfate proteoglycans on the cell-surface of the host cell
[76][77]. Interestingly, a recent review suggests the use of peptide BPP-10c (Glu-Asn-Trp-Pro-His-Pro-Gln-Ile-Pro-Pro) derived from venom of
Bothrops jararaca to counteract the effects of COVID-19
[78]. AMPs found in the secretion of mesenchymal stem cells (MSCs) have antimicrobial properties and the ability to attenuate cytokine storms, and are therefore an attractive option for combating COVID-19
[79].
The influence of the coronavirus on the course of Alzheimer’s disease has also been discussed. It has been shown that Alzheimer’s disease can be triggered by various bacterial diseases, as well as certain viruses, albeit indirectly. The hypothesis is relevant because many patients with COVID-19 have central nervous system disorders and cerebrovascular diseases. There are no data on the penetration of the virus into the human central nervous system, but the question is important because hACE2 is expressed in human nerve cells and the virus can also enter the cerebrospinal fluid through the choroid plexus of the ventricles
[80]. Interestingly, the Aβ(1-42) peptide associated with Alzheimer’s disease may have antiviral and, more generally, antimicrobial activity. AMPs have proven to be an antiviral therapy that can be effective against infection and can be used to reverse the effects of virus-induced inflammation. However, viral infections are often accompanied by bacterial infections. COVID-19 is no exception, and some patients die from bacterial co-infection. It can be assumed that the SARS-CoV-2 virus has a mechanism to suppress the production of the host’s own AMPs
[81].
AMPs have immunomodulatory effects, with some activating and others suppressing inflammation, which should be considered when selecting candidates. To select specific immunogenic epitopes and accelerate vaccine development, the SARS-CoV-2 genes were analyzed for B and T cell epitope candidates. In addition, the authors propose to increase the low immunogenicity of a peptide vaccine by including an adjuvant in its composition and using an effective delivery system based on chitosan or a copolymer of lactic and glycolic acid (PLGA). It is suggested that peptides derived from the short-palate, lung and nasal epithelial clone-1 (SPLUNC1), alpha-1-antitrypsin (AAT), dornase alfa (DA) and neutralizing human S230 light chain antibodies can be used as anti-adhesion agents, preventing the SARS-CoV-2 virus from interacting with the human host cell. However, the use of the SPLUNC1 peptide may be associated with 33.3% of allergic reactions, based on in silico data
[73]. It has been noted that clinical use of AMPs may be hampered by general properties such as sensitivity to environmental conditions, large size, and poor distribution and excretion. In the review, the authors discuss bioengineering strategies, chemical modifications, and combined approaches to finding solutions that improve the design and development of AMPs and peptide delivery technologies.