2. Bacteriophages Used for Biocontrol of Crop Diseases
The class
Caudoviricetes is the most studied among the phage groups, as it is the class in which phages for biocontrol of agricultural crop diseases are found. Its members have an icosahedral capsid and a tail whose lower part carries trimeric protein fibers that serve as receptor-binding structures for interaction with the host cell, with stability in L morphology
[14]. The class
Caudoviricetes groups phages according to tail shape: myoviruses have long contractile tails, podoviruses have short non-contractile tails, and siphoviruses have long and flexible tails (
Figure 1)
[14][15]. The most studied phages are tailed phages with a dsDNA genome, whose structure has three main parts: a capsid, a tail, and an adsorption apparatus. During assembly, the capsid first forms as a precursor structure called a procapsid. Scaffolding proteins guide the subunits of the major capsid protein to facilitate formation of the icosahedral procapsid, in which the dsDNA will be stored. The transition from procapsid to mature capsid that accompanies genome packaging is called the maturation process.
Figure 1. Schematic representation of the most commonly studied phages
[16]: (
A) levivirus MS2 has a capsid with icosahedral symmetry and a size of about 26 nm; (
B) microvirus ϕX174 has a non-enveloped icosahedral capsid about 30 nm in size; (
C) podovirus T7 has a non-enveloped icosahedral capsid about 60 nm in size; (
D) tectivirus PRD1 has a non-enveloped icosahedral capsid with a size of about 66 nm; (
E) cystovirus phi6 is an enveloped spherical virion 85 nm in diameter; outer and inner capsids have icosahedral symmetry; (
F) corticovirus PM2 has an icosahedral capsid 56 nm in diameter; (
G) myovirus T4 is non-enveloped, with a morphological head–tail structure about 110 nm in length; (
H) siphovirus T5 is non-enveloped, with a head–tail structure; head is about 60 nm in diameter; and (
I) inovirus M13 is non-enveloped, with rods of filaments 7 nm in diameter and 700 to 2000 nm in length.
The organization of phage tails is associated with the type of bacteriophage; for instance, siphoviruses have long and flexible tails, podoviruses have short tails with adhesive properties, and myoviruses have long, rigid, contractile tails formed by a rigid internal tube and an external contractile sheath. In addition, many phages that infect gram-negative bacteria have an adsorption apparatus, which is an oligomeric ring formed by distal tail proteins and attached to the last ring of the tail tube. It functions in host recognition and connects with the receptor-binding proteins
[17].
Tectiviruses have a rigid protein capsid that encloses a thick, flexible lipoprotein vesicle containing the dsDNA. These phages infect both gram-positive bacteria (genus Betatectivirus) and gram-negative bacteria (genus Alphatectivirus, which includes PRD1)
[18], which is important, as gram-negative bacteria cause the most problems in economically important crops
[19]. Corticoviruses also have a lipid layer within the protein capsid; they infect gram-negative
Pseudoalteromonas spp., and the PM2 capsid architecture is built from trimeric proteins whose fold contains two β-barrels, forming hexagonal capsomeres
[20].
Archaeal viruses can be divided into two groups. In the first group, the morphological and genetic features of the viruses are unique. The second group has clear genetic and structural similarities to bacteriophages and eukaryotic viruses. The majority of archaeal viruses have linear or circular dsDNA genomes, while only two families have single-stranded DNA (ssDNA) genomes. No RNA viruses have been isolated, but they have been detected in metagenomic studies
[21].
Currently, studies are focused on the structure and assembly of archaeal viruses, which requires characterization of unexplored host phyla to increase the knowledge of archaeal virus diversity
[22].
Proteobacteria are mainly gram-negative and cause many diseases in agricultural crops. From a structural point of view, some examples show how the structure of bacteriophage is related to bacterial species that cause damage to agricultural crops. For instance,
Xanthomonas spp. are infected by phage Xaj2 and phage Xf2. Phage RSM3 infects
Ralstonia solanacearum.
Pseudomonas syringae is infected by phages such as Phobos or MR1–MR18, classified as podoviruses or myoviruses, and by cystoviruses such as phi6 or phi8. The most studied phages are the myovirus T4 and the levivirus MS2, which infect
E. coli (
Figure 1)
[23][24][25][26].
3. Bacteriophage Classification
The International Committee on Taxonomy of Viruses (ICTV) has established the official taxonomy. In the past, viruses were classified based on several criteria, such as propagation characteristics in cell culture, virion morphology, serology, nucleic acid sequence, host range, pathogenicity, and epidemiology or epizootiology
[27]. Recently, new proposals have been made to improve virus classification because there is no standardized, automatic, and universally accepted classification method. Technological advances in sequencing, and especially metagenomics projects, have increased the number of available phage genomes, and genome sequences have since been used as a criterion for classification
[28]. Essentially, Turner et al. described the abolition of the order
Caudovirales and the families
Myoviridae,
Podoviridae, and
Siphoviridae [29]. The members of this order (Caudovirales), the largest among bacteriophages, were assigned to the class
Caudoviricetes. The old families (
Myoviridae, Podoviridae, and
Siphoviridae) were replaced by the families
Straboviridae,
Drexlerviridae, and
Autographiviridae, which have a high similarity to the old families
[30][31].
In addition, the order
Tubulavirales, which includes the phages of the family
Inoviridae, was divided into two families,
Inoviridae and
Plectroviridae [32][33]. In the family
Microviridae, additional subfamilies beyond the existing
Gokushovirinae and
Bullavirinae have been proposed, namely the subfamilies
Alpavirinae,
Stokavirinae,
Aravirinae, and
Pichovirinae based on virome data. Finally, for the family
Leviviridae, a comprehensive identification of ssRNA phages based on computational analysis of phage genomes has been described
[33][34][35][36].
4. Bacteriophage Infection Mechanisms
Bacteriophages reproduce by two biological strategies that perpetuate their genetic material. One strategy is the lytic cycle, which is the most aggressive, since it kills the host cell. The first step is attachment (adsorption): the bacteriophage binds to the host cell through a ligand–receptor interaction
[37]. This interaction is so specific that it allows differentiation between gram-negative and gram-positive hosts
[38].
The first interactions are made by the virus with specific receptors, such as lipopolysaccharides (LPSs) or outer membrane proteins. The binding of two to six fibers to the receptor sends a signal to the base plate that promotes a conformational change, leading to contraction of the tail sheath, which brings the bacteriophage into an injection position. The tail tube then pierces the membrane, generating a channel that allows the injection of genetic material. This perforation is supported by enzymes that degrade peptidoglycan (endopeptidases, N-acetylmuramoyl-L-alanine amidases, lysozymes, transglycosylases); these enzymes are part of the tail structure of bacteriophages
[37]. Once the genetic material has been introduced into the bacterium, the host's replication, transcription, and translation machinery is hijacked to copy the genetic material and synthesize viral proteins. First, proteins are synthesized that stabilize and protect the DNA or RNA molecules from degradation. In DNA phages, the genetic material is used directly as a template for transcribing the genes that code for structural proteins. In RNA phages, the genome is copied by a phage-encoded RNA-dependent RNA polymerase (replicase) rather than by reverse transcription. As soon as the structural proteins are produced, the copies of genetic material are packaged into the virion, where proteins stabilize the DNA or RNA molecules. Then the capsid, the tail, and the fibers are assembled with the base plate. Additionally, lytic enzymes known as lysins, which are encoded in the genetic material injected by the bacteriophage, are synthesized. These enzymes degrade the cell wall, which allows external liquid to enter the cytosol of the bacterium until the membrane bursts and releases the internal content, along with the virions, which start the lytic cycle again
[37][38].
An important protein during the lytic cycle is the holin (hole-forming) protein, which exists in double-stranded DNA bacteriophages and controls the length of the infectious cycle. Holins are small proteins that accumulate in the membrane and cause permeabilization. They can be classified into three types according to their topology: class I, with 95 or more residues that form three transmembrane domains (TMDs); class II, with 65 to 95 residues that form two TMDs; and class III, with one TMD in the central region of the molecule
[39][40].
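The topology-based scheme above can be expressed as a small lookup. The following is only a minimal sketch: the function name and the mapping are illustrative assumptions, and the number of TMDs is assumed to come from a separate topology prediction that is not shown here.

```python
# Hypothetical helper: maps the number of predicted transmembrane domains
# (TMDs) to the holin class described in the text.
HOLIN_CLASS_BY_TMD_COUNT = {3: "class I", 2: "class II", 1: "class III"}


def classify_holin(n_tmds: int) -> str:
    """Return the holin class for a given TMD count, or 'unclassified'."""
    return HOLIN_CLASS_BY_TMD_COUNT.get(n_tmds, "unclassified")


if __name__ == "__main__":
    # Print the class assigned to a few example TMD counts.
    for n in (3, 2, 1, 4):
        print(n, classify_holin(n))
```

Residue length is deliberately left out of the lookup, since the length ranges of classes I and II meet at 95 residues and the TMD count is the discriminating feature.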
The second strategy that bacteriophages use to reproduce and conserve their genetic material is the lysogenic cycle, which shares the steps of attachment and injection of genetic material with the lytic cycle. However, in this case, viral DNA is integrated into the bacterial chromosome and remains there in an inactive form as a prophage. This allows conservation of the bacteriophage sequence, which is replicated together with the bacterial chromosome and transferred to the bacterial daughter cells. The genetic material is integrated at binding sites located on the bacterial chromosome by factors of the bacteriophage, and not by the recombination or integration systems of the bacterium
[37].
This strategy is more elegant and does not affect the viability of the host. In the literature, it is described as a strategy that seeks to preserve the host. Since bacteriophages are specific, it is believed that they sometimes use the lysogenic cycle to keep host cells alive, because if they carried out the lytic cycle all the time, they could exterminate their hosts and therefore would also be condemned to die
[37].
5. Strategies for Bacteriophage Isolation from Plants
In general, the bacteriophage isolation process is as follows: first, the inoculum (phage) must be isolated and the host bacteria identified; a bacterial culture is then inoculated with the phage inoculum and incubated to obtain a higher bacteriophage titer, and the resulting lysate can be clarified by centrifugation or filtration
[9].
Finding suitable lytic phages is a challenge and one of the limitations of phage therapy. The characteristics that a phage must have to be a candidate for phage therapy are lytic capacity, high progeny, and host specificity, meaning that it infects a single species of bacteria while leaving the rest of the microbiome intact
[41]. In practice, however, it has been reported that phages can have more than one host and can attack groups of bacterial strains, which would make it possible to control diseases caused by a variety of bacterial strains with a single broad-host-range phage. This becomes complicated when different species of bacteria cause the disease; in this case, phages specific to particular bacterial strains can no longer solve the problem. A strategy that has, in some cases, made it possible to counteract bacterial growth is the use of a “phage cocktail”, which has shown promising results
[19][42].
Therefore, a determining step in the success of phage therapy is the isolation and characterization of phages, taking into consideration the type of sample and the host.
The primary isolation method, which was developed by Félix d’Herelle, consists of an enrichment process
[43]. An environmental sample is mixed with a culture of the host bacteria. To obtain phages against a specific disease, the sample should be taken close to the infected site, either from the leaf (5 cm²), stem (2 cm), soil (150 g), or irrigation water (10 mL) of the infected plant, before it is mixed with the host bacterial culture
[12][44]. Generally, soil samples have the highest concentration of phages. After the host bacteria are mixed with the environmental sample (the phage source), the mixture is incubated at 28–37 °C for approximately 16 to 18 h in a shaker (180 to 200 rpm). The choice of enrichment medium depends on the chosen host bacteria and the cell biomass that must be produced
[45].
In general, the most common bacteria that cause plant diseases are
Pectobacterium, Pantoea [46],
Agrobacterium [47],
Pseudomonas [48],
Ralstonia [49],
Burkholderia [50],
Acidovorax,
Xanthomonas [51],
Clavibacter,
Streptomyces [37],
Xylella [38],
Spiroplasma, and
Phytoplasma [40]. The media used for culturing these species are Luria broth (LB), PEB medium, Lennox broth supplemented with calcium, nutrient broth (NB), semi-solid yeast extract agar (NYA), periwinkle wilt (PW) medium, and ATCC medium: 988
Spiroplasma SP-4 medium. Therefore, in the process of obtaining phages, it is necessary to characterize the host bacterial strain to obtain high titers and phage production
[38].
After enrichment, bacteria and cell debris are removed from the culture by a physical method such as centrifugation or filtration so that the presence of phages can be analyzed. The identified phages are then characterized to determine whether they have the properties desired for therapy, based on their virulence capacity
[9]. During the isolation process, the protocol should include a phase for characterizing phage properties, focused on determining the lytic capacity and the host range. Some protocols recommend the use of chloroform for the extraction of phages. It is now known that chloroform inactivates enveloped phages, because their structure has components of a lipid nature, so it is not recommended when such phages may be present; for non-enveloped phages, this organic compound can be helpful. The isolation protocol should aim to obtain a high titer of phage with lytic capacity against a specific bacterial species
[39].
However, there are exceptions: when phages are found at high titers in environmental samples, researchers can mix the sample directly with the host bacterial culture and obtain titers sufficient to generate inhibition halos during characterization of the lytic capacity. In this case, it is recommended to test a limited volume of environmental sample by plaque assay
[45].
The phage concentration is generally low in most environmental samples, so adding an enrichment step to the isolation protocol is recommended. In the case of plants, the strategy for obtaining and isolating phages must be focused on taking samples from parts of the infected plant, such as leaves, roots, or stem or from irrigation water or soil
[45].
5.1. Obtaining Bacteriophages from Environment Samples
The aim of the process is to find a phage with a high titer on media suitable for reproduction; for instance, tomato diseases have been treated with titers of 10⁶ to 10⁸ plaque-forming units (PFU)/mL
[52]. The virus titer, including the phage titer, in seawater can be low, requiring preconcentration of the sample by filtration, precipitation, or both. Multiple studies recommend the use of mineral salts, such as zinc, calcium, or ferric chloride, to concentrate phages from samples of seawater, wastewater, or other liquids: at a sufficient salt concentration, the phages are no longer solvated by water molecules and precipitate
[53][54][55].
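Titers such as those quoted above are conventionally derived from plaque counts as PFU/mL = plaques / (dilution × volume plated). The sketch below illustrates that arithmetic; the function names and the example counts are assumptions for illustration and do not come from the cited studies.

```python
def titer_pfu_per_ml(plaque_count: int, dilution: float, volume_ml: float) -> float:
    """Phage titer in PFU/mL from one countable plate.

    dilution is the fraction of the original sample on the plate
    (e.g. 1e-5 for a 10^-5 dilution); volume_ml is the volume plated.
    """
    if plaque_count < 0 or dilution <= 0 or volume_ml <= 0:
        raise ValueError("count must be >= 0; dilution and volume must be > 0")
    return plaque_count / (dilution * volume_ml)


def in_reported_range(titer: float, low: float = 1e6, high: float = 1e8) -> bool:
    """Check a titer against the 10^6 to 10^8 PFU/mL range quoted in the text."""
    return low <= titer <= high


if __name__ == "__main__":
    # Illustrative numbers: 50 plaques from 0.1 mL of a 10^-5 dilution.
    titer = titer_pfu_per_ml(50, 1e-5, 0.1)
    print(f"{titer:.2e} PFU/mL, within reported range: {in_reported_range(titer)}")
```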
On the other hand, bacteriophages in diluted samples can also be concentrated by flocculation, in the form of small insoluble aggregates (flocs) even at low phage concentrations. For wastewater samples, a low-speed centrifugation step is recommended to eliminate large insoluble parts and cells from the sample
[56]. Filtration is less commonly used to concentrate phages from aqueous samples because the filters clog quickly. Nevertheless, it is an essential step in the phage collection protocol; most protocols end with filtration through a 0.45 or 0.22 μm membrane to eliminate bacteria, with the choice of pore size depending on how completely bacteria must be removed. To avoid losing very large bacteriophages on the filter, a 0.45 μm membrane is recommended
[57].
In summary, after sample collection, the phage titer can be evaluated directly by applying the sample to medium enriched with the host bacteria, provided that the sample has a high enough titer to achieve successful infection. Otherwise, a concentration step can be added to the protocol to reach the titer needed for the infection stage: the phages are enriched prior to infection by precipitation (ZnCl₂, CaCl₂, or FeCl₃) or flocculation and, in some cases, a filter is used to concentrate the phages and achieve a successful infection stage
[9][39].
5.2. Experimental Detection of Bacteriophages
In this stage, it is recommended to use detection methods for new phage isolates, such as the spot test, the plaque test, or culture lysis
[58][59].
The spot test method consists of plating the host bacteria to form a lawn and then placing droplets of phage filtrate on the plate. After incubation, a lysis zone or halo indicates phage activity. The advantage of this method is that it is simple and allows multiple phage filtrates to be tested on the same plate. A limitation is that it requires growing the host on plate media. It is also prone to false positive results due to the lysis of bacteria by media components or by phages that do not lead to productive infection
[42].
The plaque test method consists of plating high dilutions of the phage filtrate together with the bacteria on the surface of the plate, by spreading or in a soft agar overlay. The plate is then incubated, and the plaques are analyzed afterwards. Plaques are evidence of phage growth, and their appearance can indicate a lytic or lysogenic cycle. The size of the plaque may reflect the size of the phage because of diffusion; a disadvantage is that the host must grow to confluence on the plate. Most phages cannot be plated, even with highly productive hosts, due to limited agar diffusion
[60]. In the culture lysis method, the phage filtrate is added to a bacterial broth culture, which is incubated and monitored for signs of cell lysis through the turbidity of the culture. Metabolic stains can also be used to measure metabolic activity alongside turbidity. This method is used for bacteria that do not grow to confluence on plates. For bacteria that grow in broth, it can be automated using turbidity-based spectrophotometry
[58]. However, a limitation is the occurrence of false results when lysis is absent. Cellular debris can inactivate phages by charge, affecting infectivity, and hosts that evolve rapidly to become resistant to phages will cause false negatives
[61].
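The turbidity-based readout of the culture lysis method lends itself to a simple automated check. The sketch below compares optical density readings of a phage-treated culture with an untreated control; the function name, the example readings, and the 0.5 ratio threshold are assumptions chosen for illustration, not values from the cited protocols.

```python
from typing import Sequence


def lysis_detected(control_od: Sequence[float],
                   treated_od: Sequence[float],
                   max_ratio: float = 0.5) -> bool:
    """Flag lysis when the final treated OD is at most max_ratio of the control.

    Both series are turbidity readings taken at the same time points.
    The 0.5 default is an arbitrary illustrative threshold.
    """
    if not control_od or len(control_od) != len(treated_od):
        raise ValueError("series must be non-empty and of equal length")
    final_control = control_od[-1]
    if final_control <= 0:
        raise ValueError("control culture shows no growth; result is uninterpretable")
    return treated_od[-1] / final_control <= max_ratio


if __name__ == "__main__":
    control = [0.10, 0.40, 0.90]   # untreated host culture keeps growing
    treated = [0.10, 0.30, 0.20]   # turbidity drops after phage addition
    print(lysis_detected(control, treated))
```

A check of this kind cannot by itself exclude the false negatives mentioned above, for example when resistant hosts regrow before the final reading.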
In the routine test dilution (RTD) method, phages are diluted to a titer that produces near-confluent cell lysis on a plate. This is used for phages that do not show morphological differences in their plaques. Unfortunately, this method is prone to false positives because the media or components are not diluted
[62]. The main challenges of phage detection are the inoculum quantity and the specificity of the bacterial host, as well as the viability and availability of the phage, all of which affect the probability of a successful, detectable infection.
5.3. Detection of Bacteriophages from the Genome for Bacterial Biocontrol
Presently, there are repositories of sequenced genomes of organisms, and information on bacteriophages is available in PhagesDB and NCBI’s GenBank
[59][63].
The genome is important for determining the constitution of a phage, and it may be related to the host’s genotype and phenotype. The phage genome carries the genetic information that allows reproduction. Tailed dsDNA phages are the most studied, including tailed members of the class
Caudoviricetes and the families
Straboviridae and
Drexlerviridae [31]. Other types of RNA and ssDNA phages make up a small group; there may be more that have not yet been discovered
[10].
Another important characteristic is the size of the genetic material described, which can range from ~3300 nucleotides for ssRNA
E. coli phages up to 500 kbp for
Bacillus megaterium phage G. The size of tailed dsDNA phage genomes ranges from ~11.5 kbp (Mycoplasma phage P1) to ~30 kbp (Pasteurella phage F108) for members of the families
Drexlerviridae,
Autographiviridae, and
Straboviridae [31].
The phage genome encodes all the components necessary to generate new virions, with structural proteins for the capsid and tail and ligand proteins that allow the virion to interact with the host cell receptors and introduce genetic material. There may be other proteins that help in the development of infection, such as proteases, which help to evade the host’s immune response
[10][41]. These genomes show the complexity of the microorganisms (bacteria and bacteriophages). In addition, researchers present an approach for comparative genome sequencing (
Figure 2). Complete genomes of bacteriophages show high variability, which is a characteristic of bacteriophages. In these analyses, the sequence of phage Salvo was used as a template, and only phage Sano showed high sequence conservation (>70%, yellow), because it is a homologue; the other sequences added were phage phiXc10 (blue),
Ralstonia phage RSM3 (pink), and
Agrobacterium phage Atu_ph07 (green), which showed low sequence conservation between bacteriophage genomes (<30%) (
Figure 2).
Figure 2. Comparison of bacteriophage genomes used for biocontrol of proteobacteria diseases present in agricultural crops. Complete genomes of bacteriophages were obtained from NCBI. Template sequence was phage Sano; % GC content (black) is shown.
Xanthomonas phage phiXc10 (blue),
Ralstonia phage RSM3 (pink), and
Agrobacterium phage Atu_ph07 (green) show nucleotide sequence identity <30%, whereas phage Salvo (yellow) shows identity of about 70%. Genomes were compared in BRIG version 0.95 software, Brisbane, Australia (
http://sourceforge.net/projects/brig/, accessed on 8 September 2022).
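The % GC track in a genome comparison such as Figure 2 is a per-window statistic that is easy to reproduce. The following is a minimal sketch and not the BRIG implementation: the function names, window sizes, and the toy sequence are assumptions for illustration.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def windowed_gc(seq: str, window: int = 1000, step: int = 1000) -> list[float]:
    """GC percentage in successive windows along a genome sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(len(seq) - window + 1, 1)
    return [gc_content(seq[i:i + window]) for i in range(0, last_start, step)]


if __name__ == "__main__":
    toy_genome = "GGCCAATT"  # toy sequence, not a real phage genome
    print(gc_content(toy_genome), windowed_gc(toy_genome, window=4, step=4))
```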
A search of the NCBI database showed that few sequenced genomes are available; for the majority of bacteriophages, the genome is not available. Therefore, it is necessary to increase the number of genomes, for example with genomes assembled in other studies, such as metagenomics studies, and to characterize paired bacterial and phage genomes to understand host–phage relationships
[64].
Currently, computational studies seek to complement the experimental laboratory studies carried out for the characterization of bacteriophages. Multiple bioinformatics applications have been developed for faster and better identification of candidates for phage therapy. The availability of viral genomic information has allowed the development of computational tools based on machine learning, such as the phage classification tool set (PHACTS), which can predict the type of life cycle of a bacteriophage from its protein sequences; this is useful because phage therapy requires the selection of lytic phages
[65].
Another molecular approach is based on bacterial defense mechanisms. One such mechanism is the mutation of receptors to prevent phages from interacting, making them unable to penetrate and inject their genetic material. Bacteria can also recognize foreign DNA and cleave it enzymatically at specific sites. This is exploited as a molecular tool known as the Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated protein (CRISPR/Cas) system, which is considered an immune system of bacteria that confers resistance to phages
[66].
Currently, genomic information is important for the development of analytical tools at the bioinformatics level. Leite et al. (2018) described an automated phage identification system from a genomic phage library that is capable of detecting effective phages from genomic information through a combination of machine learning and bioinformatics, in order to understand the phage–bacteria relationship by analyzing their genomes
[66].
In this approach, one database is created for phage sequences and another for bacterial sequences. A model is then trained to learn the specific characteristics of phage–bacterium pairs that interact, called positive interactions; pairs for which there is no experimental evidence of interaction are used as negative interactions. After phage candidates are selected based on their characteristics and positive interactions with the bacteria, they are analyzed at the level of the protein sequences deduced from the genome, with prediction of functional domains and phage–bacterium protein interactions, focusing on membrane receptors. This information feeds the machine learning model, which selects phage candidates from theoretical genomic and proteomic information
[67].
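The labelling step of such an approach can be sketched as follows. This is an illustrative reading of the description above and not the authors' code: the function name is hypothetical, and treating every pair without recorded evidence as a negative interaction is an assumption. The phage and host names are taken from the examples given earlier in this section.

```python
from itertools import product


def build_interaction_pairs(phages, bacteria, known_positive):
    """Label every phage-host pair: 1 if the interaction is recorded, else 0.

    Assumption: pairs absent from known_positive are treated as negatives,
    although they may simply be untested.
    """
    positives = set(known_positive)
    return [
        (phage, host, 1 if (phage, host) in positives else 0)
        for phage, host in product(phages, bacteria)
    ]


if __name__ == "__main__":
    phages = ["Xaj2", "RSM3", "phi6"]
    hosts = ["Xanthomonas", "Ralstonia solanacearum", "Pseudomonas syringae"]
    known = {
        ("Xaj2", "Xanthomonas"),
        ("RSM3", "Ralstonia solanacearum"),
        ("phi6", "Pseudomonas syringae"),
    }
    for row in build_interaction_pairs(phages, hosts, known):
        print(row)
```

Sequence-derived features for each pair would then be attached to these labels before a classifier is trained; that step depends on the chosen tool and is not shown.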
Recently, Amgarten et al. (2018) developed a computational tool based on machine learning that predicts bacteriophage sequences from metagenomic data, called Metagenomic Analysis and Retrieval of Viral Elements (MARVEL). The program was trained on 1247 phage sequences and 1029 bacterial genomes, and it uses features computed from sequence fragments, including significant hits to viral proteins, to distinguish phage from bacterial sequences. To validate the accuracy of MARVEL, the authors analyzed the same contigs with VirSorter and VirFinder, two programs recommended for the identification of unknown viruses, and compared the results with those of MARVEL. VirSorter is based on alignments and similarity searches against databases of known viruses, whereas VirFinder is based on a machine learning algorithm that uses k-mer frequency profiles obtained from the contigs to train the model
[11][68][69]. The results of this study show that the three programs performed similarly, except for 2 kbp fragments. It should be noted that MARVEL showed significant values (<0.001) for all cases analyzed with positive ranges. Therefore, this new computational tool will be able to support the design and study, as well as the identification, of phages from their genetic information
[11].
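A k-mer frequency profile of the kind mentioned for VirFinder can be computed in a few lines. The sketch below is a generic illustration and not code from VirFinder or MARVEL: the function name, the default k, and the toy contig are assumptions.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 4) -> dict[str, float]:
    """Relative frequency of every A/C/G/T k-mer in a contig.

    k-mers containing other characters (e.g. N) are ignored.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


if __name__ == "__main__":
    profile = kmer_profile("ACGTACGT", k=2)  # toy contig
    # Show only the dinucleotides that occur in the toy contig.
    print({m: round(f, 3) for m, f in profile.items() if f > 0})
```

Each contig then becomes a fixed-length vector (4^k values) that a classifier can take as input, which is why this representation needs no alignment against known viruses.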
A new approach, known as VIBRANT, focuses on the annotation of viruses. It is a hybrid method that uses machine learning and protein similarity to determine genome quality and completeness and to characterize viral communities from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and newly developed v-score metrics to identify lytic viral genomes and integrated prophages. As a platform for evaluating viral community function, it was trained and validated with reference virus datasets, microbiomes, and virome data
[70].