Although
Legionella is well known as an intracellular pathogen of macrophages, it is originally a parasite of environmental protozoa. Intracellular replication within protozoa was first described for
L. pneumophila in 1980 [
33]. Studies on the interaction of
L. pneumophila with environmental amoebae demonstrated that
Acanthamoeba spp. [
34,
35,
36],
Naegleria spp. [
37],
Hartmannella spp. and
Balamuthia mandrillaris support the intracellular growth of
L. pneumophila [
38]. To date, at least 20 species of amoebae, two species of ciliated protozoa, and one species of
slime mould have been identified as potential environmental hosts for
Legionella spp. [
39]. As a group of facultatively intracellular parasites,
Legionella has developed a dependent relationship with various protozoa, which shelter the bacteria from stresses in harsh or low-nutrient environments. Indeed, when nutrients are scarce, free-living protozoa can support
Legionella multiplication and help resuscitate viable but non-culturable
L. pneumophila after disinfection [
40,
41]. Protozoa are environmental hosts of
Legionella species; they share many commonalities with mammalian phagocytes in microbicidal mechanisms, such as uptake and intracellular trafficking of
Legionella, and they have been proposed as training grounds where parasitic microorganisms can exchange genes across species or with the amoeba hosts [
42]. It has been speculated that a long-term evolutionary association with amoeba hosts facilitated
Legionella’s development of the virulence strategies used to escape routine digestive pathways in mammalian phagocytes [
43]. Humans may be an evolutionary dead end and play little role in
Legionella transmission, as the majority of disease cases have an environmental source.
Legionella is easily ingested in the environment by free-living amoebae that feed on biofilms containing
Legionella and multiple other microorganisms [
44]. Protozoa take up bacteria through conventional or coiling phagocytosis and confine them to phagosomes that subsequently fuse with lysosomes. However,
Legionella has evolved multiple strategies to escape host digestion and replicate within the protozoan predator. These strategies include disrupting endocytic transport, preventing vacuolar acidification, and interfering with host pathways such as protein translation, ubiquitination, and phosphoinositide lipid metabolism [
45,
46,
47]. Transmission electron microscopy showed that the
L. pneumophila-containing phagosomes do not enter the routine digestive pathway or fuse with lysosomes. Instead, they recruit components of the rough endoplasmic reticulum and mitochondria to build
Legionella-containing vacuoles (LCVs), in which the bacteria replicate within host cells [
48]. The infection process is mainly composed of five parts: bacterial uptake, the establishment of the LCV, intracellular multiplication, host response, and bacterial release [
48]. The same process takes place when Legionella invades human cells, and the same secretion systems are involved [
49]. It is considered that the ability of
Legionella to invade human macrophages is a consequence of its long-term adaptation to intracellular survival and multiplication in protozoa, because during infection of and survival within macrophages,
Legionella uses some of the same mechanisms and genes as in environmental hosts, such as effectors of the Icm/Dot secretion system [
50]. Human macrophages are therefore assumed to be an accidental host in the long evolutionary history of
Legionella.
Many analyses of the
Legionella genome have given insight into the signs of co-evolution between
Legionella and protozoa. Firstly,
Legionella spp. genomes have markedly high plasticity and diversity. Comparative genomics analysis of several
L. pneumophila strains revealed that ~3000 proteins are encoded and that nearly 300 genes (10%) are specific to each strain [
51]. Such diversity is unexpected within a single bacterial species. Genomic divergence is even higher among different
Legionella species. For example,
L. longbeachae carries 34.8% species-specific genes when compared with
L. pneumophila strains
Paris,
Lens,
Philadelphia, and
Corby, and only 65.2% of genes are orthologous to them [
52]. Genome sequence analysis based on 58 species showed that the genome size and GC content of
Legionella are highly diverse, ranging from 2.37 Mb to 4.88 Mb (
L. adelaidensis to
L. santicrucis) and 34.82% to 50.93% (
L. busanensis to
L. geestiana), respectively. Up to 32% of 5832 orthologous genes are strain-specific, and genes in the core genome account for only 6%, which is similar to the result obtained by analyzing 38
Legionella species [
53,
54]. The highly dynamic nature of these genomes implies that the
Legionella genome acquires genetic material by means other than vertical inheritance alone. Secondly, a large number of conserved eukaryotic-like proteins and virulence genes that functionally mimic host cell proteins were identified in the genome of
L. pneumophila but were less conserved or absent in non-
L. pneumophila [
51,
55,
56]. Some of these proteins, which were found in more than two-thirds of the 58 analyzed species, contain F-box and U-box domains that have roles in interfering with eukaryotic ubiquitination machinery and regulating the proteasomal degradation pathway [
53,
57]. In addition, some proteins carry ankyrin repeats, which target the plasma membrane or the ER of eukaryotic cells and are essential for the intracellular proliferation of
Legionella in macrophages and protozoa [
58,
59]. Some eukaryote-homologous enzymes involved in metabolism and signaling pathways, such as Rho-, Ras-, or Rab-like proteins and phospholipases, are present in the genome of
Legionella as well [
53]. They are highly conserved among
L. pneumophila genomes [
60]. In addition, the vast majority of effectors translocated by the Dot/Icm type IV secretion system (Dot/Icm T4SS) are homologous to eukaryotic proteins or have eukaryotic-like domains. All of them play important roles in intracellular replication within human macrophages and protozoa [
61]. Although eukaryotic-like proteins have been identified in many other bacterial pathogens,
L. pneumophila has been shown to encode the largest and most extensive variety of eukaryotic-like proteins or proteins with eukaryotic-homologous domains [
53]. A large number of eukaryotic-like proteins provide the bacteria with high versatility and the ability to adapt to intracellular conditions of eukaryotic cells, allowing them to infect a variety of hosts such as amoebas and humans.
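The core and strain-specific fractions quoted in this paragraph come from tables recording which orthologous groups are present in which genome. As a minimal sketch of that bookkeeping, assuming a hypothetical input that maps each genome name to the set of orthologous-group identifiers found in it (the function name `partition_pangenome` and the toy identifiers are illustrative and are not taken from the cited studies), the partition can be computed as follows:

```python
from collections import Counter


def partition_pangenome(presence):
    """Split orthologous groups into core, accessory, and genome-specific sets.

    presence: dict mapping a genome name to the set of orthologous-group IDs
    detected in that genome (hypothetical input format, for illustration only).
    """
    if not presence:
        raise ValueError("at least one genome is required")
    n_genomes = len(presence)
    # Count in how many genomes each orthologous group occurs.
    counts = Counter(og for groups in presence.values() for og in groups)
    if not counts:
        raise ValueError("no orthologous groups found")
    core = {og for og, c in counts.items() if c == n_genomes}
    specific = {og for og, c in counts.items() if c == 1}
    accessory = set(counts) - core - specific
    total = len(counts)
    return {
        "core": core,
        "accessory": accessory,
        "specific": specific,
        "core_fraction": len(core) / total,
        "specific_fraction": len(specific) / total,
    }


# Toy data, not real Legionella genomes.
toy = {
    "genome_A": {"og1", "og2", "og3"},
    "genome_B": {"og1", "og2", "og4"},
    "genome_C": {"og1", "og5"},
}
result = partition_pangenome(toy)
print(sorted(result["core"]), result["core_fraction"], result["specific_fraction"])
```

In this toy example one of five groups is shared by all genomes and three are found in a single genome. Real analyses depend heavily on how orthologous groups are defined, so the fractions reported in the text should not be expected to be reproduced by so simple a rule.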
The difference in GC content between genes encoding eukaryotic-homologous proteins and the rest of the genome suggests that horizontal gene transfer (HGT) from protozoa to
Legionella may occur [
43]. Moliner et al. have given an instance of gene interchange between
Dictyostelium discoideum and
L. drancourtii through a phylogenetic analysis of malate synthase, in which the two cluster together with a bootstrap value of 88% [
62]. Structural studies confirmed that a series of effector proteins secreted by the Dot/Icm T4SS were acquired through interdomain HGT. A clear example is RalF, an ADP-ribosylation factor (Arf) guanine nucleotide exchange factor (GEF) transported through the type IV secretion apparatus, which contains a Sec7 domain homologous to mammalian Sec7 domains [
63], and helps recruit Arf to
Legionella-containing phagosomes for the establishment of a replicative organelle, LCV. Felipe et al. screened 62 eukaryotic-like genes of
Legionella and found a considerable number of Dot/Icm T4SS substrates carrying eukaryotic domains such as ankyrin repeats, leucine-rich repeats, F-/U-box domains, Ser/Thr kinase domains, and coiled coils [
56]. Another example supporting HGT is the presence of genes for sphingolipid metabolic enzymes, including sphingomyelinase, sphingosine kinase, and sphingosine-1-phosphate lyase, in the
L. pneumophila genome. Sphingolipids are major components of eukaryotic cellular membranes and are generally conserved among all eukaryotes, yet these enzymes also appear to be evolutionarily conserved among several
Legionella species [
64,
65]. All these examples illustrate that HGT occurs from protozoans to
Legionella and that protozoans are the predominant donors of eukaryotic-like proteins for
Legionella, but they are not the only ones. Gene transfer occurs not only from protozoa to
Legionella, but also among different
Legionella species. A regulator-effectors island encoding a LuxR type regulator RegK3 and two Do/Icm T4SS effectors LegK3 and CegK3 was proved the presence of HGT among the genus
Legionella, since the regK3-legK3-cegK3 island presented in different species as a whole and no single
Legionella or
L. pneumophila strain was found to harbor only one or two of these genes, but not all strains in the same species contain this genomic island. The regK3-legK3-cegK3 island shows a different phylogenetic tree topology within the
Legionella species and has a lower GC content compared with the genomic GC content of the
Legionella species [
66]. The F-type T4ASS, which encodes a complete T4SS core as well as the proteins necessary for pilus assembly and mating pair stabilization, shows homology and collinearity with the tra region of the
Escherichia coli F plasmid and
Rickettsia bellii. The tra region in the
L. pneumophila strain
Paris plasmid (Tra1) shows the highest identity with that located on the
L. longbeachae plasmid (Tra4) when compared with other
L. pneumophila strains [
60]. In addition,
L. pneumophila also encodes proteins homologous to viruses that can infect amoeba and amoeba-associated bacteria [
67]. HGT is a crucial way of providing eukaryotic-like proteins to
Legionella. It occurs not only between the same or closely related species but also between different domains of life. Phylogenetic analysis of the evolutionary origin of eukaryotic-like proteins in
L. pneumophila suggested that a portion of these proteins were acquired from eukaryotes, while several proteins containing a typically eukaryotic domain belonged to bacterial lineages [
68]. An evolutionary dissection based on comparative genomics for Dot/Icm T4SS proteins suggested that both recombination and natural positive/negative selection are evolutionary forces that shape the diversity of Dot/Icm T4SS effectors [
69,
70]. As a result, lateral gene transfer and gradually convergent mutation that adapts to intracellular conditions both contribute to the pooling of eukaryotic-like proteins.
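The GC-content argument used in this paragraph, for both the eukaryotic-homologous genes and the regK3-legK3-cegK3 island, rests on comparing the GC fraction of a gene with that of the genome that carries it. A minimal sketch of that comparison is given below. It assumes hypothetical inputs (a dictionary of gene sequences and a genome-wide GC fraction), and the 0.05 threshold is an arbitrary illustrative choice, not a value taken from the cited studies.

```python
def gc_content(seq):
    """Return the GC fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / acgt


def flag_gc_outliers(genes, genome_gc, threshold=0.05):
    """Return genes whose GC fraction deviates from the genome-wide value.

    genes: dict mapping gene name -> DNA sequence (hypothetical input).
    genome_gc: genome-wide GC fraction between 0 and 1.
    threshold: absolute deviation treated as atypical (arbitrary assumption).
    """
    flagged = {}
    for name, seq in genes.items():
        deviation = gc_content(seq) - genome_gc
        if abs(deviation) > threshold:
            flagged[name] = deviation
    return flagged


# Toy sequences, not real Legionella genes; 0.38 is an illustrative genome GC.
toy_genes = {
    "gene_typical": "GCGCGCGCAATTAATTAATT",   # GC fraction 0.40
    "gene_atypical": "GCGCGCGCGCGCGCAATTAA",  # GC fraction 0.70
}
flagged = flag_gc_outliers(toy_genes, genome_gc=0.38)
print(flagged)
```

A deviating GC fraction is only a heuristic signal: transferred genes gradually drift toward the composition of the recipient genome, and native genes can be atypical for other reasons, which is why the studies cited above combine compositional evidence with phylogenetic analysis.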
3. A Supplement to the Concept of Virulence Factors in Legionella
Virulence factors have traditionally been defined as effectors and other proteins that contribute to any step in the infection of human macrophages, whether bacterial invasion, hijacking of vesicle transport, evasion of autophagy, or replication. This definition originates from a human-centered perspective and is intended to explain the pathogenic mechanisms of
Legionella [
71]. However, a common concept covering both environmental and accidental hosts appears to be required, as these factors are involved not only in the infection of human macrophages but also in the natural parasitization of protozoa. A contradiction arises because
Legionella species that are non-pathogenic in humans also parasitize protozoa in the environment, and the effectors involved in or required for the infection of protozoa may partially overlap with those of pathogenic species [
3].
Legionella has acquired a large number of functionally redundant proteins [
72], which means that some effectors execute parallel functions in different hosts or that many candidates can perform the same function in one host [
73]. Consequently, a difficult question arises: how can the range of virulence factors be clearly delimited? More importantly, the pathogenicity of virulent
L. pneumophila may be determined by elements other than the so-called virulence-related factors, such as viability or vitality, stability, and stress resistance [
74]. In other words, a
Legionella strain may be non-virulent or weakly virulent to humans even if it expresses all the virulence factors that are required to infect humans. Here, we take several traditional virulence factors, including components and effectors of the secretion apparatuses as well as surface proteins, as examples to discuss their roles in natural hosts and humans and to propose a more specific definition of virulence factors.
The pathogenicity of
Legionella is manifested as host cell invasion and the building of the LCV for robust intracellular replication. The pathogenesis involves attachment to host cells and modulation of host metabolism, vesicle trafficking, host autophagy, protein translation, and degradation. Proteins required for these processes are recognized as virulence factors; they mainly include structures and proteins on the bacterial surface and effectors translocated by five secretion systems (the Dot/Icm T4SS, the Lsp type II secretion system (T2SS), the Tat system, the Lss type I secretion system (T1SS), and the Lvh type IVA secretion system (T4ASS)) [
28]. The apparatuses of these five secretion systems play important roles in the life cycle of
Legionella. The Dot/Icm T4SS apparatus, which localizes in the polar regions of bacterial cells and is composed of about 27 components, is recognized as a key virulence factor required for intracellular bacterial survival. It participates in almost the entire infection process after bacterial entry, including subverting vesicle trafficking, establishing the LCV, multiplication, inhibiting host cell apoptosis, and promoting bacterial release from host cells. The Dot/Icm T4SS is proposed to be particularly crucial for virulence-related phenotypes in the infection of human macrophages and amoebae because it translocates nearly 300 effector proteins that are essential for the regulation of host physiological and biochemical activities [
28]. As a supplement to the Dot/Icm T4SS, the Lvh T4ASS conditionally works in bacterial entry and intracellular multiplication while delaying phagosome acidification [
75]. The Lsp T2SS transports proteins from the periplasm to the extracellular space. Unfolded proteins are first translocated across the bacterial inner membrane by the general secretory (Sec) pathway, then secreted to extracellular space by T2SS after folding into a tertiary conformation. Some proteins that are folded within the cytoplasm and transported across the bacterial inner membrane via the twin-arginine translocation (Tat) system can also be secreted by the T2SS apparatus [
76]. The Tat system contributes to biofilm formation and intracellular multiplication of
Legionella. The Lss T1SS is encoded by the lssXYZABD locus, in which the
lssB and
lssD genes encode the ABC transporter and membrane fusion protein, respectively. Although the relationship between T1SS and host-pathogen interactions has not been elaborated,
lssB and
lssD genes were demonstrated to be required for intracellular replication, as the Δ
lssBD mutant showed reduced internalization, delayed replication, and compromised cytotoxicity in amoebas [
77]. The secretion system apparatuses are defined as “virulence factors” because of their basic roles in infecting various hosts; however, they participate in the infection of both natural and accidental hosts and play essential roles in the life cycle of
Legionella. The core components of the secretion apparatuses are therefore necessities for survival throughout the
Legionella lifecycle.
Besides the core secretion apparatuses, effectors secreted by these systems are also required for the infection of natural hosts and accidental hosts. Among these systems, the Dot/Icm T4SS and the Lsp T2SS are the primary and essential apparatuses for invasion and intracellular replication. Most effectors secreted by the two secretion systems have been explored in terms of function and pathogenic implications and have been summarized in many articles [
27,
28,
45,
78]. Although some effectors play important roles in the infection of human macrophages, they may also be indispensable for bacterial survival, which relies on the infection of natural hosts. Some
Legionella effector gene mutants present intracellular growth defects in both human cells and protozoa. A new substrate translocated by the Dot/Icm T4SS, RavY, was characterized as an important effector for maintaining intracellular replication of
L. pneumophila in mammalian cells [
79], and also functions in
L. pneumophila replication within protozoan hosts [
80]. The protein SdhA maintains the integrity of LCV to prevent cytoplasmic degradation and host cell death by blocking the action of inositol 5-phosphatase OCRL and controlling endosomal dynamics [
81,
82]. This function is necessary for virulence because a damaged LCV membrane exposes the bacteria to the cytoplasm, activating pyroptosis and prematurely terminating bacterial replication. It can reasonably be speculated that SdhA performs the same essential functions in protozoa, given that
Legionella escapes the endocytic pathways of both protozoa and macrophages, although a phenotypic defect of the Δ
sdhA mutant in protozoa has not been reported. LCVs harboring a mavE mutant fail to acquire ER-derived vesicles and instead fuse with lysosomes, resulting in defective growth in human monocyte-derived macrophages and amoebae and aborted intrapulmonary proliferation in mice [
83]. SidJ is also a substrate of the Dot/Icm T4SS and is necessary for recruiting endoplasmic reticulum (ER) proteins and incorporating ER-derived vesicles; it is expressed throughout the entire life cycle of
Legionella [
84]. Moreover, LegC2, LegC3, and LegC7 effectors with functions similar to SidJ have been described as “SNARE-like” proteins engaging in ER-derived membrane recruitment and fusion, but not participating in bacterial replication [
85]. The DrrA protein displaces Rab-GDI from Rab1, thereby recruiting Rab1 to the plasma membrane-derived vacuoles containing
Legionella [
86]. Undoubtedly, these proteins are essential to LCV building in both protozoa and macrophages. In addition, the gene
pmiA outside the icm/dot loci was defined as a virulence factor of
L. pneumophila, involved in the survival and replication of
L. pneumophila in human macrophages and protozoa by preventing acquisition of the late endosomal-lysosomal markers LAMP-1 and LAMP-2, contributing mainly to the latter [
30]. It is therefore more suitable to call these elements survival factors rather than virulence factors. The effectors described above are involved in the infection of both protozoa and humans; nevertheless, infecting natural hosts is their original and primary function, and their pathogenicity to humans is an unintended result of evolution for survival in diverse environmental hosts, which accidentally adapted them to the intracellular environment of macrophages. Moreover, infecting humans is not the primary way of life for
Legionella but a dead end. Therefore, to some extent, they should not be defined as virulence factors that are specific to humans.
Many surface proteins involved in invading host cells are also identified as virulence factors through genomic DNA library screening, gene complementation experiments, and defective phenotypes of mutants observed after the infection of macrophages or amoebae [
87]. Protozoa and macrophages are valuable cellular models commonly used in exploring the functions of
Legionella proteins. For instance,
rtxA and
enhC loci were screened by transposon mutagenesis with significantly reduced entry into host cells compared with wild-type strains [
87]. RtxA is homologous to repeats-in-toxin (RTX) proteins secreted by the T1SS, is associated with cytotoxic activity on the bacterial surface, and is required for bacterial entry via binding to β2 integrins [
88,
89]. EnhC is a secreted protein with Sel1 repeat (SLR) motifs that can interact with eukaryotic proteins possessing immunoglobulin-like folds. The
rtxA gene was demonstrated to be virulence-related by an in-frame deletion in the gene, and its product is a modular multifunctional protein involved in multiple infection steps, including adherence to the host, cytotoxicity, pore formation, intracellular growth, damage to mice, and especially cell entry [
90,
91]. Similar functions of
rtxA could also be observed when using the environmental host
Acanthamoeba castellanii as a cellular model [
92], suggesting that
rtxA plays a role in cell entry and replication in both macrophages and protozoa. In contrast, the
enhC locus was demonstrated to be dispensable for uptake into host cells and required for robust replication within macrophages but not
Dictyostelium discoideum, which may result from the EnhC protein releasing innate immune pressure from infected macrophages, since soluble factors secreted by infected monolayers restrict intracellular growth of Δ
enhC L. pneumophila, and the growth restriction caused by low concentrations of recombinant mouse TNF-α could be compensated by a plasmid-encoded EnhC [
93]. Similarly, LpnE is another protein with SLR regions; it moderately enhances infection in both protozoa and mouse models and regulates vesicle trafficking to avoid acquisition of LAMP-1. However, this protein seems to play a more important role in the infection of macrophages, because LpnE mutant
L. pneumophila showed significantly attenuated multiplication in the lungs of A/J mice after a period of infection, which may be caused by compromised invasion and avoidance of lysosomal digestion or, more likely, by weakened resistance to innate immunity in mice, as with Δ
enhC L. pneumophila, since comparable bacterial numbers were obtained for the LpnE mutant and the wild-type strain 72 h after amoeba infection [
94]. EnhC participates in immune escape and the persistent survival of
Legionella against human macrophages but not protozoa, while LpnE plays a pivotal role in the continuous growth and proliferation of
L. pneumophila in A/J mice but not in amoebae. Therefore, they are virulence factors more specific to susceptible mammals, such as humans, and are not necessary for infecting protozoa. Other bacterial surface proteins, such as flagella and type IV pilin, play equally important roles in adherence to human cells and protozoa and also act as foreign antigens involved in routine immunity [
95]. As an adhesion molecule, the major outer membrane protein (MOMP) helps
L. pneumophila bind to macrophages and initiate an immune response [
96]. Although the above three proteins (EnhC, LpnE, and MOMP) assist bacterial entry, they act as conventional antigens causing a host immune response; they are not virulence factors specific to humans. The most typical pathogenicity-related protein on the bacterial surface is the macrophage infectivity potentiator (Mip), a peptidyl-prolyl cis/trans isomerase (PPIase), which binds to collagen IV and isomerizes peptidyl-prolyl bonds to help bacteria transmigrate through a barrier of extracellular matrix (ECM) and NCI-H292 lung epithelial cells [
97]. Although the Mip protein increases bacterial infectivity to protozoa and macrophages [
98], the molecular structures involved are not the same. The N-terminal domain and dimerization of Mip are necessary for optimal virulence in amoebas, while PPIase activity is particularly important for the infection of animal models [
99]. Despite having a role in adaptation to intracellular growth within protozoa, Mip has evolved a specific molecular mechanism for infecting animals or humans.
In conclusion, because of the similarities in bacterial uptake and digestion between protozoa and macrophages, strategies that evolved to overcome protozoan defenses can reasonably be used by bacteria to counter the innate immune mechanisms of macrophages, and defining proteins that contribute to bacterial entry into eukaryotic cells and intracellular replication in both humans and protozoa as virulence factors may be confusing. The difference between protozoa and human macrophages lies in whether a sophisticated immune system supports them against invading microbes. As unicellular organisms, protozoa rely only on phagocytosis to resist foreign bacteria, a simple defense strategy that is easily breached. In contrast, the human macrophage is part of a highly developed immune system with extensive and potent defense capabilities as well as advanced regulatory mechanisms. Thus, the primary distinction between bacteria-protozoa and bacteria-macrophage interactions is that bacterial infection of macrophages involves more complex interactions between the bacteria and the host immune system. Those molecules (e.g., proteins) that aid intracellular survival and replication of Legionella by modulating the host immune response can be defined as virulence factors specific to humans. Furthermore, considering human-specific virulence factors in terms of immunological modulation may be more relevant.