Bacteriophage Tail Fiber Interaction with Host Surface Receptor: Comparison
Please note this is a comparison between Version 2 by Catherine Yang and Version 1 by Jingen Zhu.

Bacteriophages (phages), as natural antibacterial agents, are being rediscovered because of the growing threat of multi- and pan-drug-resistant bacterial pathogens globally. Most phages package their genome in the proteinaceous capsid (or head) and have a tail attached to the capsid. Tailed double-stranded DNA bacteriophages belonging to the class Caudoviricetes (Cauda means “tail” in Latin) are the most prevalent (~96% of all known phages). Based on tail morphology, they are further classified into three morphotypes: myovirus, siphovirus, and podovirus. Myophages (e.g., T4, T2, Mu, S16, and φKZ) have long, rigid, contractile tails with a sheath around a central tube; siphophages (e.g., λ, T5, HK97, and SPP1) possess long, flexible, non-contractile tails; and podophages (e.g., T7, T3, P22, and φ29) have short, non-contractile tails. Of these, myophages possess the most complex tail architectures with the greatest number of proteins involved in tail assembly and function.

  • bacteriophage (phage)
  • T4 phage
  • tail fiber
  • tail fiber structure

1. Molecular and Structural Insight of the Interaction between T4 Phage Long Tail Fibers (LTF) and Escherichia coli Receptors

1.1. T4 Phage Architecture and the Structure of Long Tail Fibers (LTF)

Phage T4, the best-characterized phage studied to date, belongs to the Straboviridae family [6][1] (https://ictv.global/taxonomy (accessed on 1 October 2022)) and infects the Gram-negative Escherichia coli (E. coli) and closely related Shigella species [33][2]. T4 capsid, a 120 nm long and 86 nm wide prolate icosahedron, encapsidates a ~171 kbp linear, double-stranded DNA genome, and its exterior is decorated with Hoc (highly antigenic outer capsid protein) and Soc (small outer capsid protein) nonessential proteins (Figure 1A), which can be fused with foreign proteins for various biomedical applications [34,35,36,37,38,39][3][4][5][6][7][8]. T4’s circularly permuted genome consists of ~289 protein coding sequences and encodes 40 structural proteins, with most of them involved in tail assembly (Figure 1B) [40,41,42][9][10][11]. The capsid has a unique portal vertex [43][12] to which a 140 nm long contractile tail is attached via a neck/connector complex [44,45][13][14]. The collar and whiskers formed by Wac (or fibritin) are assembled just below the capsid–tail junction [46][15]. The tail comprises an interior rigid tube, an exterior contractile sheath surrounding the tube, and a hexagonal baseplate at the tip of the tail (Figure 1A) [47,48,49][16][17][18]. Two types of fibers, six 145 nm long tail fibers (LTFs) and six 40 nm short tail fibers (STFs), are attached to the baseplate (Figure 1A) [50,51][19][20]. The two sets of tail fibers confer to T4 phage one of the most effective infection efficiencies [52][21]. The T4 LTFs determine its host specificity via interacting with bacterial surface receptors.
Figure 1. T4 phage architecture and the structure and assembly of long tail fibers. (A) T4 phage architecture [59][22]. (B) T4 genetic map showing gene clusters with related functions and origin and direction of transcripts (arrows) [60][23]. (C) The structure and schematic of T4 long tail fiber with seventeen mass domains observed by scanning transmission microscopy [54][24]. Reprinted with permission from Elsevier. (D) The assembly of T4 long tail fiber [8][25]. (E) The structure of the T4 long tail fiber “needle” (D10 and D11) responsible for host receptor recognition [57][26]. (a) Ribbon structure of the trimeric “needle” consisting of knob, stem, and tip. The N and C termini and every 10th residue of one chain are labeled. (b,c) Surface structure of the trimeric “needle” seen from the side (b) and top (c).
The T4 LTFs can reversibly interact with outer membrane porin C (OmpC) and lipopolysaccharide (LPS) receptors exposed on the surface of E. coli K12 and E. coli B strains, respectively, initiating the adsorption process, which is the first step in the T4 lytic life cycle [8,50,53][19][25][27]. The kinked LTF consists of four structural gene products (gp) (Figure 1C): gp34 (140 kDa, 1289 amino acids), gp35 (35 kDa, 372 amino acids), gp36 (23 kDa, 221 amino acids), and gp37 (109 kDa, 1026 amino acids), with a gp34/gp35/gp36/gp37 stoichiometry of 3:1:3:3 [54][24]. The long and thin LTF can be divided into ~70 nm proximal and ~75 nm distal half-fibers (proximal and distal describe position relative to the point where the assembled LTF attaches to the tail baseplate), hinged at an angle of around 160° [8,54][24][25]. The proximal half-fiber is formed by a homotrimer of gp34, followed by the hinge composed of monomeric gp35, whereas the distal half-fiber is formed by homotrimers of gp36 and gp37 (Figure 1C) [55][28]. The four proteins are arranged roughly linearly along the fiber: the N-terminal end of gp34 binds to the baseplate periphery, while its C-terminal end attaches to gp35. Similarly, the N-terminal end of gp36 attaches to gp35, while the C-terminal end of gp36 binds to the N-terminal end of gp37. Finally, the C-terminal end of gp37 contains the most distal receptor-binding domain (RBD or “tip”), which is responsible for recognizing the host receptors (Figure 1C,D) [50][19].
The T4-encoded molecular chaperone gp57A is required for the correct trimerization of gp34 and gp37, and another chaperone, gp38, is required for the proper folding and functionality of gp37 [56][29]. Both gp57A and gp38 are absent in the final assembled T4 virion. For LTF assembly, homo-trimeric gp34 and gp37 assemble independently. Initially, trimeric gp36 proteins assemble on the N-terminal region of the gp37 trimer to form the distal half-fiber, and then monomeric gp35 joins to the gp36 free end to form a gp35–gp36–gp37 complex. The proximal half-fiber gp34 trimer attaches to the gp35–gp36–gp37 complex to form the final complete LTF, which then can coaxially attach to the C terminal domain of gp9 located at the upper edge of the tail baseplate (Figure 1D) [34,50][3][19].
No atomic resolution structures of whole LTFs have yet been presented because the large size and simple linear structure lead to poor crystallization [47][16], although the T4 LTF is one of the best-characterized tail fibers to date [7,57,58][26][30][31]. Seventeen mass domains of variable size and spacing have been observed in the intact LTF by scanning transmission microscopy (Figure 1D) [54][24]: five domains in the proximal half-fiber gp34 (P1 to P5), a single domain in the gp35 hinge, and eleven domains in the distal half-fiber gp36–gp37 (D1 to D11). Domains D1 and D2 close to the hinge are probably made of gp36, while domains D3 to D11 are formed by gp37 (Figure 1C,D) [51,54][20][24].
The atomic structure of trimeric D10 and D11 domains at the C terminus (residues 811–1026 of the 1026 aa gp37) has been determined [57][26]. This 20 nm-long “needle” region consists of a globular “knob” (~45-Å wide, D10), an elongated “stem” (~15-Å wide, D11), and a small “tip” (~25-Å wide, D11) (Figure 1E). Each chain of the interwoven trimer emanates from the “knob” to the end of the “tip”, twists around a neighboring chain, and turns back, with both the N and C termini located at the “knob”. The D11 domain (residues 882–1009) inserts into the D10 “knob” domain (residues 811–881 and 1010–1026). D11 consists of two sub-domains: “stem” (residues 882–931 and 960–1009) and “tip” (residues 932–959) (Figure 1E). Most of the D11 amino acids are found in an extended conformation forming the elongated “stem” subdomain. The compact and interwoven “tip” subdomain, inserted into the “stem” subdomain and located at the distal pole of the LTF, plays a primary role in the interaction with host receptors: E. coli B type LPS (B-LPS) and E. coli K12 type OmpC (K12-OmpC) (Figure 1E) [51,57,58][20][26][31].
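The sub-domain boundaries listed above can be written down as a small lookup table. The following is a minimal sketch, using only the stem, tip, and knob residue ranges quoted in the text; the names NEEDLE_REGIONS and needle_region are hypothetical and chosen here for illustration.

```python
# Residue ranges are the knob/stem/tip boundaries quoted in the text above.
# The table layout and helper name are illustrative assumptions.
NEEDLE_REGIONS = [
    ("knob (D10)", 811, 881),
    ("stem (D11)", 882, 931),
    ("tip (D11)", 932, 959),
    ("stem (D11)", 960, 1009),
    ("knob (D10)", 1010, 1026),
]


def needle_region(residue: int) -> str:
    """Return the needle sub-domain that contains a gp37 residue number."""
    for name, start, end in NEEDLE_REGIONS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the needle region (811-1026)")


if __name__ == "__main__":
    for position in (820, 900, 945, 1000, 1020):
        print(position, needle_region(position))
```

Because each chain runs from the knob to the tip and back, the knob and stem each appear twice in the table, once on the outbound and once on the return path.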

1.2. Molecular and Structural Insight of T4 LTFs’ Interaction with Host Receptors LPS and OmpC

Bioinformatic analysis suggests extensive sequence conservation of tail fibers from various phages and prophages, except for the “tip” domain [53,57,61][26][27][32]. The LTF “tip” has diverged with distinct shapes and sizes to acquire specific receptor-binding properties [13,14,18,57,62,63,64,65][26][33][34][35][36][37][38][39]. Therefore, understanding the molecular mechanism of interaction between the “tip” and host receptor will provide the basis for reprogramming the phage–host interaction. Here, the researchers highlight the molecular mechanisms of T4 LTF “tip” binding to the terminal glucose of B-LPS and to K12-OmpC, and then summarize the structural model of LTFs during T4 infection initiation.

1.2.1. T4 LTF “Tip” Binding to LPS

LPS, a large glycolipid, is abundant in the outer membrane of Gram-negative bacteria (around a million molecules per cell) and is the primary receptor for phages [66][40]. LPS is generally composed of three structural domains: lipid A, the oligosaccharide core, and the distal polysaccharide (or O-antigen) [66][40]. Lipid A is hydrophobic and forms the outer leaflet of the bacterial outer membrane. The core oligosaccharide is a non-repeating oligosaccharide that is divided into two linked moieties: the inner core and the outer core [67][41]. The inner core is bound to the extracellular side of lipid A, while the outer core is extended from the inner core. The core usually contains glucose, galactose, heptose, and 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo), which can be further modified with phosphates, N-acetylglucosamine, and other substituents [67,68][41][42]. The O-antigen is attached to the outer core and consists of a repeating oligosaccharide (two to eight sugars). The LPS is classified into one of two varieties, smooth LPS (e.g., E. coli O157) or rough LPS (e.g., E. coli B and K12), based on the presence or absence of O-antigen, respectively [69][43]. The LPS can also be divided into five types based on the constitution of the oligosaccharide core [70,71][44][45], such as B-LPS and K12-LPS, which have different outer cores (Figure 2A).
Figure 2. Molecular and structural insight of T4 long tail fiber interaction with host receptors LPS and OmpC. (A) Structural schematics of T4 LPS receptors on the surface of E. coli B and K12 strains: Glu, Glucose; Hep, L-glycero-D-manno heptose; P, Phosphate; KDO, 3-deoxy-D-manno-oct-2-ulosonic acid; GAL, Galactose; and NAG, N-acetylglucosamine. Created by BioRender.com. (B) The “tip” surface structures and the critical amino acid residues involved in host receptor binding [58][31]. (a) The bottom surface structure of the “tip” showing three small cavities, each suitable for the accommodation of one glucose moiety. (b) The key residues on the bottom surface of the “tip”. (c) The key residues on the lateral surface of the “tip”. (C) OmpC viewed from the top (left) and the side (right) [75][46]. The T4 phage LTF binding components, loops 1, 4, and 5, and residues P177 and F182, are highlighted. The outer and inner membranes of E. coli K12 are indicated by the gray bars. (D) The T4 LTF “tip”-OmpC docking model at different angles [58][31]. Panels (C) and (D) were redrawn with UCSF ChimeraX [81][47].

1.2.2. T4 LTF “Tip” Binding to OmpC

Porin serves as an aqueous pore that is abundant in the outer membrane of Gram-negative bacteria and facilitates the nonspecific diffusion of nutrients and water-soluble drugs, with a molecular mass cut-off of about 600 Da [24,82][48][49]. In E. coli, two major porins, OmpC and OmpF, represent more than 50% of the total protein integrated into the outer membrane [24][48]. OmpC is the primary receptor for the T4 phage to infect the E. coli K12 strain because T4 cannot adsorb to a K12 mutant lacking OmpC [68][42]. It is known that OmpC also serves as a receptor for other phages, such as Tulb, Hy2, AR1, and ss4 [72,83][50][51]. OmpC is organized as a trimer. Each monomer shows a β-barrel structure formed out of 16-stranded antiparallel β-sheets, with 8 internal periplasmic turns and 8 extracellular loops that connect each β-sheet [82][49]. Structural docking analysis has shown that the size of the LTF “tip” domain (~25 Å) is similar to the size of the surface cavity formed by the trimeric OmpC molecules [57][26]. The “tip” fits snugly into the OmpC outer cavity and most likely interacts with the extracellular loops. The extracellular loops 1, 4, and 5 in K12-OmpC are required for efficient T4 phage adsorption (Figure 2C) [75][46].
A number of residues at the “tip” are also found to be involved in binding to K12-OmpC. The residues that have been studied to date are: I933, N937, G938, G940, V941, G942, G943, K945, M946V, S947, Y949, I951, Y953, and A955 (Figure 2B) [33,57,58,68,75][2][26][31][42][46]. Interestingly, almost all these residues (except V941) overlap with the residues involved in B-LPS binding. Similarly, these residues are divided into two types: (1) “Loss-of-function” residues and (2) “host-range-expanded/shifted” residues. For “loss-of-function” residues, some mutations, including G940A, V941E, G942A, G943A, G943S, K945A, S947A, Y949A, I951A, Y953A, Y953R, and A955E, result in loss of interaction between the LTFs and K12-OmpC. Most of these mutations also lose infection of the E. coli K12 strain, except residues K945A, I951A, Y953A, and Y953R, which obtain the capacity of binding to K12-LPS as compensation for infection. As indicated by structural docking analysis, residues G940, G942, G943, S947, and Y949, lining the cavity surface of the “tip” bottom, interact with the amino acid residues exposed in the barrel cavity of the K12-OmpC receptor, probably via a combination of hydrogen bonds, hydrophobic interactions, and shape-complementary van der Waals interactions [58][31]. In addition, residues K945, I951, and Y953, located above the upper rim of the bottom cavity, are also suitable for interacting with residues lining the K12-OmpC barrel cavity [58,75][31][46].
Notably, each of the mutations K945A, I951A, Y953A, and Y953R abolishes binding to the original K12-OmpC receptor but confers binding to a new sugar receptor, K12-LPS, which provides a significant clue for designing more effective phage therapeutics. The reason is that porins (Omp proteins), which allow the passage of drugs such as antibiotics, might not be the optimal phage receptors in practical phage therapy. If porin-targeting phages are used in therapy, phage-resistant bacteria with downregulated or even defective porins might be selected. These mutants would then be harder to treat with antibiotics because the drug passage channels would be diminished or lost altogether [22,24,86,87][48][52][53][54].
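The substitution labels used above (for example, Y953R means tyrosine 953 replaced by arginine) follow a simple pattern that is easy to handle programmatically. The sketch below parses the loss-of-OmpC-binding substitutions named in the text and flags the four reported to gain K12-LPS binding; the function and variable names are hypothetical, and the tip range 932–959 is the one quoted earlier for the “tip” sub-domain.

```python
import re

# Tip sub-domain residues 932-959, as quoted in the text.
TIP_RESIDUES = range(932, 960)

# Substitutions the text lists as losing K12-OmpC binding.
LOSS_OF_OMPC_BINDING = [
    "G940A", "V941E", "G942A", "G943A", "G943S", "K945A",
    "S947A", "Y949A", "I951A", "Y953A", "Y953R", "A955E",
]

# Subset the text reports as gaining K12-LPS binding.
GAIN_K12_LPS_BINDING = {"K945A", "I951A", "Y953A", "Y953R"}


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'Y953R' into (wild type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def summarize(label: str) -> dict:
    """Collect the facts stated in the text for one substitution."""
    wild_type, position, replacement = parse_substitution(label)
    return {
        "label": label,
        "position": position,
        "in_tip": position in TIP_RESIDUES,
        "binds_K12_LPS": label in GAIN_K12_LPS_BINDING,
    }


if __name__ == "__main__":
    for label in LOSS_OF_OMPC_BINDING:
        print(summarize(label))
```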

1.3. Model of Long Tail Fibers (LTFs) during T4 Infection Initiation

Since phages do not have dedicated motility structures to move independently, initial adsorption results from random phage–host collision described by the Law of Mass Action. T4 phage infection is initiated with host recognition in which the LTF “tip” specifically and reversibly recognizes the LPS or OmpC receptor on the cell wall. The E. coli cell wall consists of two concentric lipid bilayers, the outer membrane and the inner (cytoplasmic) membrane, with the peptidoglycan-containing periplasm between them [88][55]. Upon receptor recognition at a suitable site on the cell wall, a mechanical signal is transferred to the phage baseplate, causing a conformational change of the baseplate. The baseplate-anchored short tail fibers (STFs) are then unpinned, rotate downward, and irreversibly bind to the lipid A-inner core region of LPS. The baseplate completes the conformational conversion from hexagonal to star shape during the tail-fiber-binding process and is oriented parallel to the cell surface. Then, contraction of the tail sheath is triggered, pushing the hollow tail tube through the host outer membrane and periplasm. The inner membrane bulges from its normal plane to fuse with the phage ejection nanomachine. Finally, a channel across the outer and inner membranes is formed, facilitating the injection of phage genomic DNA into the bacterial cytoplasm for the synthesis of new virions. The infection details have been reviewed elsewhere [7,8,49,50,89][18][19][25][30][56]. Here, the researchers highlight the model of LTFs during recognition and infection initiation, which plays an important role in determining the host range (Figure 3).
Figure 3. “Touch and Search” model of LTFs during T4 infection initiation. In the free state, most phages have three to four retracted LTFs on average. Each fiber is likely in a dynamic “retracted–extended” equilibrium that does not need chemical energy to maintain. Upon infection, the trimeric LTF “tip” allows weak and unstable interaction with the receptor, probably causing the “tip” to move up and down as well as rotationally (“association–dissociation” equilibrium). The “association–dissociation” and “extended–retracted conformation” dynamic equilibriums allow the T4 phage to randomly walk across the host surface to search for an optimal infection site [51][20].
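The random-collision adsorption step described above is commonly approximated with first-order mass-action kinetics, dP/dt = -k·B·P, where P is the free phage concentration, B the bacterial concentration, and k an adsorption rate constant. The sketch below is a simplified illustration under stated assumptions: bacteria are in excess so B stays constant, adsorption is treated as irreversible (the text notes that LTF binding itself is reversible), and the value of k is a placeholder rather than a measured T4 constant.

```python
import math


def free_phage_fraction(k_ml_per_min: float, bacteria_per_ml: float, minutes: float) -> float:
    """Fraction of phages still unadsorbed after `minutes`.

    Solves dP/dt = -k * B * P with constant B, giving P(t)/P(0) = exp(-k * B * t).
    """
    if min(k_ml_per_min, bacteria_per_ml, minutes) < 0:
        raise ValueError("all arguments must be non-negative")
    return math.exp(-k_ml_per_min * bacteria_per_ml * minutes)


if __name__ == "__main__":
    k = 1e-9          # placeholder adsorption constant (mL/min), not a measured value
    bacteria = 1e8    # cells per mL, assumed constant
    for t in (0, 5, 10, 20):
        print(f"t = {t:2d} min  free fraction = {free_phage_fraction(k, bacteria, t):.3f}")
```

In this approximation the free phage titer decays exponentially, so doubling the cell density halves the time needed to reach a given adsorbed fraction.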

2. Engineering Strategies of Phage Tail Fiber for Reprogramming Phage Host Range

The interaction between a bacteriophage and its host is mediated by the phage’s tail fiber “tip” domain or receptor binding domain, which is thus the main engineering site for reprogramming the phage host range. In earlier days, researchers relied on the natural evolution process to increase the host range of phages. Then, the identification and characterization of the phage receptor binding domain at molecular/atomic levels allow the engineering of the tail fiber to reprogram the phage host range. More recently, the advancement of bioinformatics, machine learning, and artificial intelligence could allow rapid identification of phage–host interaction based on the characterized genome sequences of various phages and bacteria.

2.1. Host Range Widening through Natural Evolution

There is an evolutionary arms race between bacteria and phages [91][57]. Bacterial hosts have evolved multiple anti-phage tactics, such as phage receptor blocking and the CRISPR-Cas system [22,92][52][58]. On the other side, phages have also evolved corresponding strategies to avoid or circumvent these selective pressures, such as tail fiber mutation to recognize new receptors and the anti-CRISPR system [93,94,95][59][60][61]. Researchers have been exploiting this natural evolution strategy to extend the phage host range. This strategy involves growing wild-type phages in various hosts for several generations to generate mutants (mainly in tail fibers) that infect new hosts. For example, the host range of T7 phage was initially limited to E. coli and a few Shigella strains. Using natural evolution, the host range was extended to Yersinia pestis [96][62]. In another approach, Burrows et al. applied a phage cocktail to phage-resistant and -sensitive bacteria. After 30 rounds of selection, recombinant phages were isolated with significantly extended host ranges [97][63]. The limitation of natural evolution is its requirement for extensive co-culture and a labor- and time-intensive selection process.

2.2. Rational Genetic Engineering of Tail Fibers

The rational genetic engineering approach requires an extensive molecular understanding of phage tail fibers that interact with hosts. The phage host range can be reprogrammed via swapping the tail or tail fiber genes with those from other phages by homology-directed recombination or synthetic engineering. Yoichi et al. changed the host range of T2 phage by swapping its gp37 and gp38 (two gene products at the tip of T2 long tail fiber) with the corresponding gene products from phage PP01 to produce a recombinant T2 phage. T2 phage normally infects E. coli K12, whereas PP01 infects E. coli O157:H7. Interestingly, the recombinant T2 phage could no longer infect its native host, E. coli K12, but was able to infect E. coli O157:H7 [98][64]. Additionally, Ando et al. developed a synthetic biology strategy for extending the host range of T7 phage via swapping whole-tail components (including tail fiber) [99][65]. They used a yeast-based platform for phage genome engineering and then transformed the modified phage genomes into E. coli to reboot phages with novel host ranges. Using this technology, they redirected E. coli T7 phage to efficiently target and kill new bacterial hosts, including Yersinia and Klebsiella [99][65].
In addition, the phage host range can also be reprogrammed by random mutation or directed evolution in the phage tail or tail fiber region. Yosef et al. developed GOTrap (General Optimization of Transducing particle) technology to engineer T7 phage for extending foreign DNA transduction into new bacterial hosts (Figure 4A) [100][66]. In this study, the T7 wild-type whole-tail genes (gp11, gp12, and gp17) were deleted from the phage genome. Then, T7 phages lacking their tail genes were used to infect E. coli hosts encoding randomly mutated tails (by chemical mutagenesis) in a packable plasmid with a selectable marker, resulting in numerous variants of tail-mutated phage particles. The selected phage particles were characterized and were able to deliver the desired DNA to a broad range of new bacterial species, including Klebsiella, Shigella, Salmonella, Escherichia, and Enterobacter (Figure 4A) [100][66]. The major limitation of this technology is the need to generate a large plasmid library (~10¹² to 10¹⁵) for creating multiple simultaneous mutations in the tail genes [100][66].
Figure 4. Representative engineering strategies of phage tail fiber. (A) GOTrap model [100][66]: (i) T7 phage without a tail fiber gene infects E. coli carrying a plasmid with antibiotic resistance marker, a T7 packaging signal, and various tail components. (ii) The new phage contains a different tail fiber (iii) that can recognize a new host for plasmid transduction. (iv) A new host containing the phage-transduced plasmid is selected on an antibiotic plate. (B) Phagebody model: The loop regions (tip) of T3 phage gp17 targeted for random mutagenesis are marked in magenta, green, yellow, and red. The selected phagebody can suppress bacterial growth [101][67]. (C) ORACLE method: An acceptor phage is generated in which the tail fiber gene is replaced with a fixed sequence flanked by Cre recombinase sites (a landing site for inserting tail variants) (yellow). The phage variants are then generated within the host by Cre-mediated optimized recombination by inserting tail fiber variants from a donor plasmid (blue, containing single amino acid substitution) into the landing sites [102][68]. Created by BioRender.com.
For widening the host range of T3 phage (a T7-like phage), Yehl et al. adopted structure-informed engineering of viral tail fibers to generate host range alterations [101][67]. Using homology modeling with the structurally characterized T7 phage, they generated the tail fiber structure of T3 and identified similar distal loops located at the tail fiber tip. These loops act as the host-determining region that binds to the LPS receptor of the host bacterial strain. Inspired by antibody specificity engineering, they created a T3 tail-mutated library with high-throughput mutations in the four outward loop regions containing ten million different members (Figure 4B), which was named “Phagebody”. The engineered phage showed increased or altered host range and also could efficiently reduce the growth of both antibiotic- and phage-resistant bacterial strains in in vitro and in vivo mouse models [101][67]. As more and more viral tail and tail fiber components are structurally resolved and characterized, the breadth of viral models to which this “Phagebody” strategy can be used will also be extended.
As the researchers summarized for the T4 LTF “tip” and receptor interaction, the majority of amino acid residues of the “tip” region are critical for bacterial receptor interaction. Likewise, in T7 phage, Huss et al. [102][68] demonstrated that the residues of the T7 receptor binding domain are important for efficiency and specificity. Even small changes to the receptor binding domain can make a big difference to the T7 phage’s ability to infect its hosts. They designed the ORACLE method (Optimized Recombination Accumulation and Library Expression) to create a T7 phage mutation library containing all single amino acid substitutions (1660 variants) in the tip domain (Figure 4C). The role of each residue in the T7 receptor binding domain was meticulously dissected, and hundreds of function-enhancing substitutions were identified [102][68]. These studies help to understand exactly how the tail fiber allows a virus to infect a specific type of bacteria and could pave the way for fighting increasingly resistant bacterial infections.
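A library of all single amino acid substitutions, as described above, can be enumerated directly from a protein sequence. The sketch below illustrates only the enumeration idea and is not the ORACLE workflow itself: the five-residue input is a made-up toy peptide, the helper name is hypothetical, and only the 19 alternative standard amino acids per position are generated, so the count is 19 times the sequence length.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_substitutions(sequence: str, first_position: int = 1) -> list[tuple[str, str]]:
    """Return (label, variant sequence) for every single amino acid substitution.

    Labels use the usual form, e.g. 'G2A' = glycine at position 2 replaced by alanine.
    """
    variants = []
    for index, wild_type in enumerate(sequence):
        for replacement in AMINO_ACIDS:
            if replacement == wild_type:
                continue  # skip the unchanged residue
            label = f"{wild_type}{index + first_position}{replacement}"
            mutated = sequence[:index] + replacement + sequence[index + 1:]
            variants.append((label, mutated))
    return variants


if __name__ == "__main__":
    toy_peptide = "GVGGK"  # hypothetical example, not a real tip-domain sequence
    library = single_substitutions(toy_peptide, first_position=940)
    print(len(library), "variants; first three:", library[:3])
```

The library size grows linearly with the length of the targeted region, which is what makes exhaustive single-substitution scans of a short tip domain tractable.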

2.3. Bioinformatic Prediction of Phage Host Range

The phage host range can be evaluated or predicted using a bioinformatic approach. Some phages and their hosts share common evolutionary ancestry; in other words, there is sequence homology between these phages and their hosts. As a result, it is easy to identify the host range of phages using common alignment tools such as BLAST (Basic Local Alignment Search Tool) [61][32]. If there is no significant sequence similarity between the phage and its host, other genomic features, such as codon usage, sequence composition, oligonucleotide frequency, and k-mer composition (k-mer is used to analyze nucleotide composition) can also be used to identify the host range. There are some tools available online to identify these features, such as VirHostMatcher, which identifies host range based on oligonucleotide frequency in a k-mer length [103][69]. Lastly, the advancement of high-throughput sequencing technology allows prediction of host range based on sequence features, CpG bias, and CG bias through machine learning algorithms [104,105][70][71].
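The composition-based idea above can be illustrated by comparing k-mer frequency profiles of a phage and candidate hosts. The sketch below is a toy illustration, not the VirHostMatcher method: it uses a plain Manhattan distance as a stand-in for that tool's dissimilarity measures, and the sequences and host names are invented.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence: str, k: int = 2) -> list[float]:
    """Relative frequency of every A/C/G/T k-mer, in a fixed alphabetical order."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(letters) for letters in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)  # k-mers with other symbols are ignored
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[kmer] / total for kmer in kmers]


def manhattan(p: list[float], q: list[float]) -> float:
    """Sum of absolute differences between two profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def closest_host(phage: str, hosts: dict[str, str], k: int = 2) -> str:
    """Name of the host whose k-mer profile is nearest to the phage profile."""
    phage_profile = kmer_profile(phage, k)
    return min(hosts, key=lambda name: manhattan(phage_profile, kmer_profile(hosts[name], k)))


if __name__ == "__main__":
    toy_hosts = {"host_AT_rich": "ATATTATAATAT", "host_GC_rich": "GCGCGGCCGCGC"}
    print(closest_host("ATATATATGCAT", toy_hosts))
```

Real genomes are millions of bases long and published tools use larger k and more robust dissimilarity measures; the short strings here only make the arithmetic easy to follow.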
In the machine learning approach, metagenomic datasets can be directly used to identify the phage receptor binding protein (tail fiber); in addition, it allows the identification of new receptor binding proteins [106][72]. Furthermore, using the metagenomic dataset, a targeted phage with an extended host range can be generated by swapping the tail or tail fiber [107][73]. Boeckaerts et al. developed a machine learning tool for predicting phage host specificity based on the annotated receptor binding protein sequence data [108][74]. From the raw DNA and RNA sequences, they generated 218 features, of which nucleotide frequency, TTA codon frequency, TTA codon usage bias, first Z scale descriptor (lipophilicity of amino acids), and GC content were the five with the highest influence on the prediction model. They evaluated the predictive performance of the model by using four machine learning methods. When they compared the final model with BLASTp, they reported that the final model outperformed BLASTp when the sequence similarity to other known sequences in the database was less than 75% [108][74] (Figure 5A).
Figure 5. Schematic representation of the machine learning model. (A) The raw data (DNA and RNA sequences) of the receptor binding protein database are used to extract features. Next, the features are fitted into different machine learning models, which are evaluated to predict the best result. (B) Representation of PERPHECT model; phage and bacterial genetic information are used by the PERPHECT model and PERPHECT generator to provide guidance for genomic modification of the existing phage [109][75]. Created by BioRender.com.
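Some of the sequence features named above, such as GC content and TTA codon frequency, are simple to compute from a coding sequence. The sketch below shows how such features might be extracted before being passed to a classifier; it is an illustration under assumptions, not the published feature pipeline, and the function names, feature keys, and toy sequence are hypothetical.

```python
def gc_content(dna: str) -> float:
    """Fraction of G and C bases in the sequence."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return (dna.count("G") + dna.count("C")) / len(dna)


def codon_frequency(dna: str, codon: str = "TTA") -> float:
    """Fraction of in-frame codons equal to `codon` (reading frame starts at base 1)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence shorter than one codon")
    return codons.count(codon.upper()) / len(codons)


def feature_vector(dna: str) -> dict[str, float]:
    """Small illustrative feature set for one receptor-binding-protein gene."""
    dna = dna.upper()
    features = {f"freq_{base}": dna.count(base) / len(dna) for base in "ACGT"}
    features["gc_content"] = gc_content(dna)
    features["tta_codon_freq"] = codon_frequency(dna, "TTA")
    return features


if __name__ == "__main__":
    toy_gene = "ATGTTAGGCTTATAA"  # invented 5-codon example
    for name, value in feature_vector(toy_gene).items():
        print(f"{name}: {value:.3f}")
```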
To systematically engineer the phage genome, deep learning (a subset of machine learning) can also be used. Ataee et al. proposed a two-component deep learning model: the first component is the PERPHECT model that predicts the interaction between bacteria and phages using a 1-D convolutional neural network (CNN); and the second component is the PERPHECT generator that alters the existing phage genome to enhance host range prediction (Figure 5B) [109][75]. In the predictor model, they used genomic information from both phages and bacteria to predict interactions. The predictor used 46 strains of Pseudomonas aeruginosa to predict the host range, and the generator model modified the genomic sequences of 42 phages for precise host range engineering. The generator model’s training accuracy is reasonably high (~96%), and it could enhance the host range of 18 out of 42 phages. This deep learning method allows systematic modification of the phage genome for host range engineering, which generates superior phage variants that overcome limitations due to bacterial resistance against natural phages.
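To make the 1-D CNN idea concrete, the sketch below one-hot encodes a DNA string and slides a single filter along it, followed by a ReLU and global max pooling. It is a generic illustration of a 1-D convolution over sequence data and not the PERPHECT architecture: the filter weights are set by hand to match a motif rather than learned, and all names and sequences are hypothetical.

```python
ALPHABET = "ACGT"


def one_hot(sequence: str) -> list[list[float]]:
    """Encode each base as a 4-element indicator vector (A, C, G, T)."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in sequence.upper()]


def conv1d_relu(encoded: list[list[float]], kernel: list[list[float]],
                bias: float = 0.0) -> list[float]:
    """Slide one filter along the sequence ('valid' padding) and apply ReLU."""
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            for channel in range(len(ALPHABET)):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        activations.append(max(0.0, total))
    return activations


def global_max_pool(activations: list[float]) -> float:
    """Keep the strongest response of the filter anywhere in the sequence."""
    return max(activations)


if __name__ == "__main__":
    motif_filter = one_hot("TTA")  # hand-set weights: responds most to 'TTA'
    responses = conv1d_relu(one_hot("GGTTACC"), motif_filter)
    print(responses, global_max_pool(responses))
```

In a trained network many such filters are learned from data and their pooled responses feed further layers; the hand-set filter here simply counts matching bases in each window.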

References

  1. Walker, P.J.; Siddell, S.G.; Lefkowitz, E.J.; Mushegian, A.R.; Adriaenssens, E.M.; Alfenas-Zerbini, P.; Dempsey, D.M.; Dutilh, B.E.; García, M.L.; Curtis Hendrickson, R.; et al. Recent changes to virus taxonomy ratified by the International Committee on Taxonomy of Viruses (2022). Arch. Virol. 2022.
  2. Tétart, F.; Repoila, F.; Monod, C.; Krisch, H.M. Bacteriophage T4 Host Range is Expanded by Duplications of a Small Domain of the Tail Fiber Adhesin. J. Mol. Biol. 1996, 258, 726–731.
  3. Rao, V.B.; Black, L.W. Structure and assembly of bacteriophage T4 head. Virol. J. 2010, 7, 356.
  4. Chen, Z.; Sun, L.; Zhang, Z.; Fokine, A.; Padilla-Sanchez, V.; Hanein, D.; Jiang, W.; Rossmann, M.G.; Rao, V.B. Cryo-EM structure of the bacteriophage T4 isometric head at 3.3-Å resolution and its relevance to the assembly of icosahedral viruses. Proc. Natl. Acad. Sci. USA 2017, 114, E8184–E8193.
  5. Fokine, A.; Chipman, P.R.; Leiman, P.G.; Mesyanzhinov, V.V.; Rao, V.B.; Rossmann, M.G. Molecular architecture of the prolate head of bacteriophage T4. Proc. Natl. Acad. Sci. USA 2004, 101, 6003–6008.
  6. Zhu, J.; Tao, P.; Mahalingam, M.; Sha, J.; Kilgore, P.; Chopra, A.K.; Rao, V. A prokaryotic-eukaryotic hybrid viral vector for delivery of large cargos of genes and proteins into human cells. Sci. Adv. 2019, 5, eaax0064.
  7. Zhu, J.; Ananthaswamy, N.; Jain, S.; Batra, H.; Tang, W.-C.; Lewry, D.A.; Richards, M.L.; David, S.A.; Kilgore, P.B.; Sha, J.; et al. A universal bacteriophage T4 nanoparticle platform to design multiplex SARS-CoV-2 vaccine candidates by CRISPR engineering. Sci. Adv. 2021, 7, eabh1547.
  8. Zhu, J.; Jain, S.; Sha, J.; Batra, H.; Ananthaswamy, N.; Paul, B.K.; Emily, K.H.; Yashoda, M.H.; Wu, X.; Juan, P.O.; et al. A Bacteriophage-Based, Highly Efficacious, Needle- and Adjuvant-Free, Mucosal COVID-19 Vaccine. mBio 2022, 13, e01822-22.
  9. Miller, E.S.; Kutter, E.; Mosig, G.; Arisaka, F.; Kunisawa, T.; Ruger, W. Bacteriophage T4 genome. Microbiol. Mol. Biol. Rev. 2003, 67, 86–156.
  10. Wu, X.; Zhu, J.; Tao, P.; Rao, V.B. Bacteriophage T4 Escapes CRISPR Attack by Minihomology Recombination and Repair. mBio 2021, 12, e0136121.
  11. Comeau, A.M.; Bertrand, C.; Letarov, A.; Tétart, F.; Krisch, H.M. Modular architecture of the T4 phage superfamily: A conserved core genome and a plastic periphery. Virology 2007, 362, 384–396.
  12. Sun, L.; Zhang, X.; Gao, S.; Rao, P.A.; Padilla-Sanchez, V.; Chen, Z.; Sun, S.; Xiang, Y.; Subramaniam, S.; Rao, V.B.; et al. Cryo-EM structure of the bacteriophage T4 portal protein assembly at near-atomic resolution. Nat. Commun. 2015, 6, 7548.
  13. Fokine, A.; Zhang, Z.; Kanamaru, S.; Bowman, V.D.; Aksyuk, A.A.; Arisaka, F.; Rao, V.B.; Rossmann, M.G. The molecular architecture of the bacteriophage T4 neck. J. Mol. Biol. 2013, 425, 1731–1744.
  14. Kostyuchenko, V.A.; Chipman, P.R.; Leiman, P.G.; Arisaka, F.; Mesyanzhinov, V.V.; Rossmann, M.G. The tail structure of bacteriophage T4 and its mechanism of contraction. Nat. Struct. Mol. Biol. 2005, 12, 810–813.
  15. Letarov, A.; Manival, X.; Desplats, C.; Krisch, H.M. gpwac of the T4-type bacteriophages: Structure, function, and evolution of a segmented coiled-coil protein that controls viral infectivity. J. Bacteriol. 2005, 187, 1055–1066.
  16. Taylor, N.M.; Prokhorov, N.S.; Guerrero-Ferreira, R.C.; Shneider, M.M.; Browning, C.; Goldie, K.N.; Stahlberg, H.; Leiman, P.G. Structure of the T4 baseplate and its function in triggering sheath contraction. Nature 2016, 533, 346–352.
  17. Yap, M.L.; Klose, T.; Arisaka, F.; Speir, J.A.; Veesler, D.; Fokine, A.; Rossmann, M.G. Role of bacteriophage T4 baseplate in regulating assembly and infection. Proc. Natl. Acad. Sci. USA 2016, 113, 2654–2659.
  18. Arisaka, F.; Yap, M.L.; Kanamaru, S.; Rossmann, M.G. Molecular assembly and structure of the bacteriophage T4 tail. Biophys. Rev. 2016, 8, 385–396.
  19. Hyman, P.; van Raaij, M. Bacteriophage T4 long tail fiber domains. Biophys. Rev. 2017, 10, 463–471.
  20. Hu, B.; Margolin, W.; Molineux, I.J.; Liu, J. Structural remodeling of bacteriophage T4 and host membranes during infection initiation. Proc. Natl. Acad. Sci. USA 2015, 112, E4919–E4928.
  21. Goldberg, E. Recognition, Attachment, and Injection. In Molecular Biology of Bacteriophage T4; Mathews, C.K., Kutter, E.M., Mosig, G., Berget, P.B., Eds.; American Society for Microbiology: Washington, DC, USA, 1983; pp. 32–39.
  22. Padilla-Sanchez, V. Structural Model of Bacteriophage T4. Wikij. Sci. 2021, 4, 5.
  23. Freifelder, D.E. Molecular Biology, 2nd ed.; Science Books International: Boston, MA, USA, 1983; p. 614.
  24. Cerritelli, M.E.; Wall, J.S.; Simon, M.N.; Conway, J.F.; Steven, A.C. Stoichiometry and domainal organization of the long tail-fiber of bacteriophage T4: A hinged viral adhesin. J. Mol. Biol. 1996, 260, 767–780.
  25. Leiman, P.G.; Arisaka, F.; van Raaij, M.J.; Kostyuchenko, V.A.; Aksyuk, A.A.; Kanamaru, S.; Rossmann, M.G. Morphogenesis of the T4 tail and tail fibers. Virol. J. 2010, 7, 355.
  26. Bartual, S.G.; Otero, J.M.; Garcia-Doval, C.; Llamas-Saiz, A.L.; Kahn, R.; Fox, G.C.; van Raaij, M.J. Structure of the bacteriophage T4 long tail fiber receptor-binding tip. Proc. Natl. Acad. Sci. USA 2010, 107, 20287–20292.
  27. Dunne, M.; Prokhorov, N.S.; Loessner, M.J.; Leiman, P.G. Reprogramming bacteriophage host range: Design principles and strategies for engineering receptor binding proteins. Curr. Opin. Biotechnol. 2021, 68, 272–281.
  28. King, J.; Laemmli, U.K. Polypeptides of the tail fibres of bacteriophage T4. J. Mol. Biol. 1971, 62, 465–477.
  29. Hashemolhosseini, S.; Stierhof, Y.D.; Hindennach, I.; Henning, U. Characterization of the helper proteins for the assembly of tail fibers of coliphages T4 and lambda. J. Bacteriol. 1996, 178, 6258–6265.
  30. Nobrega, F.L.; Vlot, M.; de Jonge, P.A.; Dreesens, L.L.; Beaumont, H.J.E.; Lavigne, R.; Dutilh, B.E.; Brouns, S.J.J. Targeting mechanisms of tailed bacteriophages. Nat. Rev. Microbiol. 2018, 16, 760–773.
  31. Islam, M.Z.; Fokine, A.; Mahalingam, M.; Zhang, Z.; Garcia-Doval, C.; van Raaij, M.J.; Rossmann, M.G.; Rao, V.B. Molecular anatomy of the receptor binding module of a bacteriophage long tail fiber. PLoS Pathog. 2019, 15, e1008193.
  32. Altschul, S.F.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 1990, 215, 403–410.
  33. Wang, C.; Tu, J.; Liu, J.; Molineux, I.J. Structural dynamics of bacteriophage P22 infection initiation revealed by cryo-electron tomography. Nat. Microbiol. 2019, 4, 1049–1056.
  34. Hu, B.; Margolin, W.; Molineux, I.J.; Liu, J. The bacteriophage T7 virion undergoes extensive structural remodeling during infection. Science 2013, 339, 576–579.
  35. North, O.I.; Sakai, K.; Yamashita, E.; Nakagawa, A.; Iwazaki, T.; Büttner, C.R.; Takeda, S.; Davidson, A.R. Phage tail fibre assembly proteins employ a modular structure to drive the correct folding of diverse fibres. Nat. Microbiol. 2019, 4, 1645–1653.
  36. Garcia-Doval, C.; van Raaij, M.J. Structure of the receptor-binding carboxy-terminal domain of bacteriophage T7 tail fibers. Proc. Natl. Acad. Sci. USA 2012, 109, 9390–9395.
  37. Subramanian, S.; Dover, J.A.; Parent, K.N.; Doore, S.M. Host Range Expansion of Shigella Phage Sf6 Evolves through Point Mutations in the Tailspike. J. Virol. 2022, 96, e0092922.
  38. Šiborová, M.; Füzik, T.; Procházková, M.; Nováček, J.; Benešík, M.; Nilsson, A.S.; Plevka, P. Tail proteins of phage SU10 reorganize into the nozzle for genome delivery. Nat. Commun. 2022, 13, 5622.
  39. Linares, R.; Arnaud, C.-A.; Effantin, G.; Darnault, C.; Epalle, N.H.; Erba, E.B.; Schoehn, G.; Breyton, C. Structural basis of bacteriophage T5 infection trigger and E. coli cell wall perforation. bioRxiv 2022.
  40. Raetz, C.R.H.; Whitfield, C. Lipopolysaccharide Endotoxins. Annu. Rev. Biochem. 2002, 71, 635–700.
  41. Bertani, B.; Ruiz, N. Function and Biogenesis of Lipopolysaccharides. EcoSal Plus 2018, 8.
  42. Washizaki, A.; Yonesaki, T.; Otsuka, Y. Characterization of the interactions between Escherichia coli receptors, LPS and OmpC, and bacteriophage T4 long tail fibers. MicrobiologyOpen 2016, 5, 1003–1015.
  43. Kalynych, S.; Morona, R.; Cygler, M. Progress in understanding the assembly process of bacterial O-antigen. FEMS Microbiol. Rev. 2014, 38, 1048–1065.
  44. Schnaitman, C.A.; Klena, J.D. Genetics of lipopolysaccharide biosynthesis in enteric bacteria. Microbiol. Rev. 1993, 57, 655–682.
  45. Amor, K.; Heinrichs, D.E.; Frirdich, E.; Ziebell, K.; Johnson, R.P.; Whitfield, C. Distribution of Core Oligosaccharide Types in Lipopolysaccharides from Escherichia coli. Infect. Immun. 2000, 68, 1116–1124.
  46. Suga, A.; Kawaguchi, M.; Yonesaki, T.; Otsuka, Y. Manipulating Interactions between T4 Phage Long Tail Fibers and Escherichia coli Receptors. Appl. Environ. Microbiol. 2021, 87, e0042321.
  47. Pettersen, E.F.; Goddard, T.D.; Huang, C.C.; Meng, E.C.; Couch, G.S.; Croll, T.I.; Morris, J.H.; Ferrin, T.E. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci. 2021, 30, 70–82.
  48. Rosas, N.C.; Lithgow, T. Targeting bacterial outer-membrane remodelling to impact antimicrobial drug resistance. Trends Microbiol. 2022, 30, 544–552.
  49. Vergalli, J.; Bodrenko, I.V.; Masi, M.; Moynié, L.; Acosta-Gutiérrez, S.; Naismith, J.H.; Davin-Regli, A.; Ceccarelli, M.; van den Berg, B.; Winterhalter, M.; et al. Porins and small-molecule translocation across the outer membrane of Gram-negative bacteria. Nat. Rev. Microbiol. 2020, 18, 164–176.
  50. Rakhuba, D.V.; Kolomiets, E.I.; Dey, E.S.; Novik, G.I. Bacteriophage receptors, mechanisms of phage adsorption and penetration into host cell. Pol. J. Microbiol. 2010, 59, 145–155.
  51. Yu, S.L.; Ko, K.L.; Chen, C.S.; Chang, Y.C.; Syu, W.J. Characterization of the distal tail fiber locus and determination of the receptor for phage AR1, which specifically infects Escherichia coli O157:H7. J. Bacteriol. 2000, 182, 5962–5968.
  52. Labrie, S.J.; Samson, J.E.; Moineau, S. Bacteriophage resistance mechanisms. Nat. Rev. Microbiol. 2010, 8, 317–327.
  53. Altamirano, F.G.; Forsyth, J.H.; Patwa, R.; Kostoulias, X.; Trim, M.; Subedi, D.; Archer, S.K.; Morris, F.C.; Oliveira, C.; Kielty, L.; et al. Bacteriophage-resistant Acinetobacter baumannii are resensitized to antimicrobials. Nat. Microbiol. 2021, 6, 157–161.
  54. Wang, X.; Loh, B.; Altamirano, F.G.; Yu, Y.; Hua, X.; Leptihn, S. Colistin-phage combinations decrease antibiotic resistance in Acinetobacter baumannii via changes in envelope architecture. Emerg. Microbes Infect. 2021, 10, 2205–2219.
  55. Goodsell, D.S. Escherichia coli. Biochem. Mol. Biol. Educ. 2009, 37, 325–332.
  56. Rossmann, M.G.; Mesyanzhinov, V.V.; Arisaka, F.; Leiman, P.G. The bacteriophage T4 DNA injection machine. Curr. Opin. Struct. Biol. 2004, 14, 171–180.
  57. Hampton, H.G.; Watson, B.N.J.; Fineran, P.C. The arms race between bacteria and their phage foes. Nature 2020, 577, 327–336.
  58. Dy, R.L.; Richter, C.; Salmond, G.P.C.; Fineran, P.C. Remarkable Mechanisms in Microbes to Resist Phage Infections. Annu. Rev. Virol. 2014, 1, 307–331.
  59. Habusha, M.; Tzipilevich, E.; Fiyaksel, O.; Ben-Yehuda, S. A mutant bacteriophage evolved to infect resistant bacteria gained a broader host range. Mol. Microbiol. 2019, 111, 1463–1475.
  60. Pawluk, A.; Davidson, A.R.; Maxwell, K.L. Anti-CRISPR: Discovery, mechanism and function. Nat. Rev. Microbiol. 2018, 16, 12–17.
  61. Srikant, S.; Guegler, C.K.; Laub, M.T. The evolution of a counter-defense mechanism in a virus constrains its host range. eLife 2022, 11, e79549.
  62. Molineux, I.J. The T7 group. Bacteriophages 2005, 2, 277–301.
  63. Burrowes, B.H.; Molineux, I.J.; Fralick, J.A. Directed in vitro evolution of therapeutic bacteriophages: The Appelmans protocol. Viruses 2019, 11, 241.
  64. Yoichi, M.; Abe, M.; Miyanaga, K.; Unno, H.; Tanji, Y. Alteration of tail fiber protein gp38 enables T2 phage to infect Escherichia coli O157:H7. J. Biotechnol. 2005, 115, 101–107.
  65. Ando, H.; Lemire, S.; Pires, D.P.; Lu, T.K. Engineering modular viral scaffolds for targeted bacterial population editing. Cell Syst. 2015, 1, 187–196.
  66. Yosef, I.; Goren, M.G.; Globus, R.; Molshanski-Mor, S.; Qimron, U. Extending the host range of bacteriophage particles for DNA transduction. Mol. Cell 2017, 66, 721–728.e3.
  67. Yehl, K.; Lemire, S.; Yang, A.C.; Ando, H.; Mimee, M.; Torres, M.D.T.; de la Fuente-Nunez, C.; Lu, T.K. Engineering phage host-range and suppressing bacterial resistance through phage tail fiber mutagenesis. Cell 2019, 179, 459–469.e9.
  68. Huss, P.; Meger, A.; Leander, M.; Nishikawa, K.; Raman, S. Mapping the functional landscape of the receptor binding domain of T7 bacteriophage by deep mutational scanning. eLife 2021, 10, e63775.
  69. Ahlgren, N.A.; Ren, J.; Lu, Y.Y.; Fuhrman, J.A.; Sun, F. Alignment-free oligonucleotide frequency dissimilarity measure improves prediction of hosts from metagenomically-derived viral sequences. Nucleic Acids Res. 2017, 45, 39–53.
  70. Versoza, C.J.; Pfeifer, S.P. Computational Prediction of Bacteriophage Host Ranges. Microorganisms 2022, 10, 149.
  71. Meng, C.; Zhang, J.; Ye, X.; Guo, F.; Zou, Q. Review and comparative analysis of machine learning-based phage virion protein identification methods. Biochim. Biophys. Acta (BBA)-Proteins Proteom. 2020, 1868, 140406.
  72. Cantu, V.A.; Salamon, P.; Seguritan, V.; Redfield, J.; Salamon, D.; Edwards, R.A.; Segall, A.M. PhANNs, a fast and accurate tool and web server to classify phage structural proteins. PLoS Comput. Biol. 2020, 16, e1007845.
  73. Dunne, M.; Rupf, B.; Tala, M.; Qabrati, X.; Ernst, P.; Shen, Y.; Sumrall, E.; Heeb, L.; Plückthun, A.; Loessner, M.J.; et al. Reprogramming Bacteriophage Host Range through Structure-Guided Design of Chimeric Receptor Binding Proteins. Cell Rep. 2019, 29, 1336–1350.e4.
  74. Boeckaerts, D.; Stock, M.; Criel, B.; Gerstmans, H.; De Baets, B.; Briers, Y. Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins. Sci. Rep. 2021, 11, 1467.
  75. Ataee, S.; Brochet, X.; Peña-Reyes, C.A. Bacteriophage Genetic Edition using LSTM. Front. Bioinform. 2022, 2, 932319.