Rhabdoviridae is the most diverse family of negative-sense, single-stranded RNA viruses. It comprises 40 ecologically distinct genera that infect plants, insects, reptiles, fishes, birds, and mammals, including humans. To date, only a few bird-related rhabdoviruses, within the genera Sunrhavirus, Hapavirus, and Tupavirus, have been described and analyzed at the molecular level.
1. Introduction
Rhabdoviruses are enveloped RNA viruses belonging to the order Mononegavirales. They are characterized by a bullet or rod shape and contain a single or segmented molecule of linear negative-strand RNA of approximately 10−16 kb, which carries the five canonical genes encoding the nucleoprotein (N), the phosphoprotein (P), the matrix protein (M), the glycoprotein (G), and the RNA-dependent RNA polymerase (L) [1][2]. Moreover, various novel and diverse accessory genes or putative open reading frames (ORFs) overlap these genes or are interspersed between them [3]. Each gene and some of the accessory ORFs are flanked by relatively well-conserved transcription initiation (TI) and transcription termination polyadenylation (TTP) sequences [1][3].
The family Rhabdoviridae is the most diverse within the Mononegavirales, with 40 different genera and 246 species according to the latest update by the International Committee on Taxonomy of Viruses (ICTV: https://talk.ictvonline.org/, accessed on 25 September 2021) [2]. The members of this family exhibit a large ecological diversity, with pathogens infecting various plants and animals, including insects, fishes, reptiles, birds, and mammals such as livestock and humans. Some of them have significant public health, livestock, aquaculture, and agricultural impacts [4]. However, dozens of putative or unclassified new species are still waiting to be assigned to potential new genera. Early taxonomy of these viruses was based on virion morphology and serological cross-reactivity. Thus, many unclassified rhabdoviruses had been assigned to certain taxa according to their serological cross-reactivity with typical members of a given rhabdovirus genus or group. For instance, Tupaia virus (TUPV) and Klamath virus (KLAV) had been related to the genus Vesiculovirus [5][6][7], but subsequent molecular analysis definitively classified them into the new and distinct genus Tupavirus [1]. Similarly, the previously uncharacterized Duvenhage lyssavirus (DUVV), Lagos bat lyssavirus (LBV), Mokola lyssavirus (MOKV), European bat lyssavirus 1 (EBLV-1), and European bat lyssavirus 2 (EBLV-2) were initially classified as rabies-related viruses by serological tests before being considered individual species within the genus Lyssavirus [8]. As molecular and sequencing techniques have become increasingly efficient and available, genotyping is now considered a key element of viral taxonomy [9][10].
To date, 12 bird-related viruses have been placed within the family Rhabdoviridae, in three different genera, namely Tupavirus, Hapavirus, and Sunrhavirus. Most of them were identified in different bird species in Africa, with the exception of Durham virus (DURV), which was isolated from Fulica americana in North America in 2005 [11]. This virus is the only bird-related rhabdovirus within the genus Tupavirus. Similarly, Landjia hapavirus (LJAV), isolated from Riparia paludicola in the Central African Republic in 1970, is the only representative of bird-related viruses within the genus Hapavirus [1]. Conversely, the new genus Sunrhavirus is associated to a significant degree with birds. It includes Sunguru virus (SUNV), which was described in domestic chickens in Uganda in 2013 [12], and Garba virus (GARV), which was reported in Corythornis cristata in Bangui in the Central African Republic in 1970 [1], as well as the unclassified rhabdovirus Farmington virus (FARV), which was also identified in a wild bird in North America in 1969 [13].
In addition to these characterized viruses, other bird rhabdoviruses are still waiting for taxonomic assignment. They include five viruses collected in the Central African Republic, namely the Bimbo (BBOV), Kolongo (KOLV), Nasoule (NASV), Ouango (OUAV), and Sandjimba (SJAV) viruses, and two rhabdoviruses collected in Egypt, the Matariya (MTYV) and Burg el Arab (BEAV) viruses. At the time of their collection, initial serological analysis suggested links to the rabies virus group or the bovine ephemeral fever ephemerovirus group [5][14][15]. Subsequent studies based on phylogenetic analysis of limited regions of the N and L genes repositioned them into the unclassified Sandjimba group [16][17][18][19]. Later, the members of the Sandjimba group, which were most closely related to several bird- or insect-associated rhabdoviruses originating in America and Africa, were assigned to the new rhabdovirus genus Sunrhavirus [1][12][14][16][17][18][19].
2. Genome Characterization of the Bird-Related Rhabdoviruses
The genome sequences of the seven bird-related rhabdoviruses were determined using next-generation sequencing (NGS). Between 0.1 and 12 million raw reads were obtained per sample (around 6 million reads on average) (Table 1). Any remaining gaps and low-coverage regions were resolved by specific PCR or nested PCR followed by Sanger sequencing of the corresponding amplicons. Each final consensus sequence was then used as the reference sequence in a last mapping round for final verification (Table 1). Nearly complete genome sequences (lacking the 3′ leader and 5′ trailer sequences) were obtained for all seven viruses, ranging from 10,805 to 11,021 nt in length. The average coverage for each sequence varied from approximately 11x to 630x (Table 1, and Supplementary Materials, Figure S1).
Table 1. NGS results obtained for the genome sequences of the seven bird-related rhabdoviruses.
| Isolate | Virus | Genome Size (nt) | Raw Reads (no.) | Cleaned Reads (no.) | Mapped Reads (no.) | Average Coverage Depth * (x) | GenBank Accession Number |
|---|---|---|---|---|---|---|---|
| 9716RCA | BBOV | 10,969 | 3,547,036 | 3,013,558 | 15,226 | 206.23 | MW491756 |
| 9717RCA | KOLV | 10,971 | 4,881,362 | 4,214,012 | 46,475 | 629.49 | MW491757 |
| 9718RCA | OUAV | 10,805 | 12,256,546 | 9,754,242 | 28,150 | 384.67 | MW491758 |
| 0408RCA | SJAV | 10,951 | 12,164,616 | 9,908,488 | 14,166 | 191.80 | MW491754 |
| 0410RCA | NASV | 10,977 | 8,689,566 | 6,630,310 | 20,937 | 281.01 | MW491755 |
| 09023EGY | BEAV | 10,846 | 101,186 | 65,258 | 832 | 11.15 | MW491759 |
| 09027EGY | MTYV | 11,021 | 2,742,712 | 2,289,168 | 3,635 | 45.58 | MW491760 |
* Sequence coverage obtained after the last mapping round.
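The link between the mapped-read counts and the coverage depths in Table 1 can be illustrated with a short calculation. This is only a sketch, not the authors' pipeline: it assumes that mean depth is approximately (mapped reads × mean read length) / genome size. Because the read length is not stated in this section, the code back-calculates the read length implied by each row instead of assuming one. The function names are hypothetical.

```python
# Values copied from Table 1: genome size (nt), mapped reads, reported mean depth (x).
TABLE_1 = {
    "BBOV": (10969, 15226, 206.23),
    "KOLV": (10971, 46475, 629.49),
    "OUAV": (10805, 28150, 384.67),
    "SJAV": (10951, 14166, 191.80),
    "NASV": (10977, 20937, 281.01),
    "BEAV": (10846, 832, 11.15),
    "MTYV": (11021, 3635, 45.58),
}


def mean_depth(mapped_reads, mean_read_len, genome_len):
    """Approximate mean depth as total mapped bases divided by genome length."""
    return mapped_reads * mean_read_len / genome_len


def implied_read_length(depth, genome_len, mapped_reads):
    """Invert the approximation: mean read length implied by a reported depth."""
    return depth * genome_len / mapped_reads


# Assumption: the reported depth follows the simple approximation above.
implied = {
    virus: implied_read_length(depth, genome, mapped)
    for virus, (genome, mapped, depth) in TABLE_1.items()
}

if __name__ == "__main__":
    for virus, length in implied.items():
        print(f"{virus}: ~{length:.0f} nt per mapped read")
```

Under this approximation, every row implies a similar mean length per mapped read, which is a quick consistency check on the tabulated values rather than a statement about the sequencing protocol.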
All seven bird-related viruses exhibited a typical rhabdovirus genome organization, with the five canonical genes encoding, in the following order, the N (1257−1284 nt, 418−427 aa), P (759−852 nt, 252−283 aa), M (498−522 nt, 165−173 aa), G (1644−1698 nt, 547−565 aa), and L (6201−6219 nt, 2066−2072 aa) proteins (Figure 1). Putative accessory genes (U), present as additional ORFs, were also identified in the genome sequences, numbering from two (NASV, OUAV, and MTYV) to seven (BBOV) (Figure 1).
Figure 1. Schematic genome organization of the seven bird-related sunrhaviruses. The gray and black arrows represent the five canonical open reading frames (ORFs) (N, P, M, G, and L), and the putative additional ORFs (≥180 nt), respectively. The length (nucleotide and amino acid) of each ORF is indicated.
Each of these viruses presented a putative alternative ORF within the P gene (264−312 nt, 87−103 aa), which corresponds to the C protein already found in other rhabdoviruses, such as Sunguru virus (SUNV), Garba virus (GARV), Durham virus (DURV), Tupaia virus (TUPV), or Klamath virus (KLAV). Amino acid conservation of this protein was variable among the seven newly described bird-related viruses and GARV (identity 26.1−80.4%), and lower in comparison with three other sunrhaviruses, namely SUNV, Harrison Dam virus (HDV), and Walkabout Creek virus (WCV) (10.6−28.7%). The sequences were highly distinct from those of the remaining sunrhaviruses and the tupaviruses (Supplementary Materials, Figure S2 and Table S3A). For all seven bird-related viruses, this protein was predicted to be non-polar, approximately neutral or slightly basic (7.1−8.71), and non-cytoplasmic, similar to most of the other C proteins found in the sunrhaviruses and tupaviruses, with the notable exception of KLAV and TUPV (Supplementary Materials, Table S4). Another additional ORF common to all seven viruses (216−234 nt, 71−77 aa) was found between the M and G genes. It corresponds to the small hydrophobic (SH) protein previously observed in the tupaviruses and in the previously unclassified SUNV, HDV, WCV, and GARV. Here again, the amino acid identity of the SH protein was variable, but it was higher among the seven newly described bird-related viruses and GARV (32.4−85.9%) than in comparison with the other sunrhaviruses HDV, SUNV, and WCV (20.2−28.2%) or with the tupaviruses DURV, KLAV, and TUPV (13.8−23.3%) (Supplementary Materials, Figure S3 and Table S3B). The SH proteins of the remaining sunrhaviruses, Dillard’s Draw virus (DDV), Kwatta virus (KWAV), and Oak Vale virus (OVRV), were clearly distinct from the other SH proteins (Supplementary Materials, Figure S3 and Table S3B).
The topology of the SH protein, which is acidic (4.65−6.55) and hydrophobic, was similar for the bird-related viruses, with a signal peptide (13−20 aa), followed by an extracellular region (9 or 15 aa), a transmembrane part (18−27 aa) with two amino acid helices, and a cytoplasmic domain (22 or 28 aa) at the C-terminus (Supplementary Materials, Table S5). This topology is also observed in SUNV, DDV, and HDV. Lastly, BBOV, KOLV, and BEAV exhibited a putative ORF within the N gene, whereas additional ORFs were found in the G gene, one for SJAV and KOLV and three for BBOV. None of these additional ORFs exhibited similarity to other known proteins in a BLASTp analysis against the non-redundant protein sequence or UniProt databases with default parameters.
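The identification of putative ORFs of at least 180 nt can be sketched as a simple scan of the three forward reading frames. This is a minimal illustration, not the tool used in the study. It assumes that the input is the coding (positive-sense) representation of the genome, that ORFs start at ATG, and that the reported length includes the stop codon, which is consistent with the nt and aa lengths quoted above (for example, 418 aa × 3 + 3 = 1257 nt). The function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=180):
    """Return (start, end, frame) for ATG...stop ORFs of at least min_len nt.

    Coordinates are 0-based and end-exclusive; the length includes the stop codon.
    Only the three forward frames of the given sequence are scanned.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start >= min_len:
                    orfs.append((start, end, frame))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence: one 186-nt ORF followed by a 9-nt ORF that is filtered out.
    toy = "CC" + "ATG" + "GCT" * 60 + "TAA" + "ATGAAATAG"
    print(find_orfs(toy))
```

Because overlapping ORFs such as C lie in a different frame from the P ORF, scanning each frame independently is what allows them to be reported alongside the canonical genes.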
The transcription initiation (TI) signal was highly conserved among the five canonical genes, with the AACA sequence motif (Supplementary Materials, Figure S4). The consensus transcription termination polyadenylation (TTP) sequence for the canonical genes was TG(A)7, except for the M gene of KOLV and OUAV, which had TG(A)6 and TG(A)8, respectively, and for the L gene of SJAV, which exhibited the CG(A)7 motif. Among the putative accessory genes, only SH was flanked by these conserved TI and TTP signal sequences, and this was the case for all seven bird rhabdoviruses. Surprisingly, the TTP sequence of the SH gene of OUAV was found after the TI sequence of the next gene (G gene) (Supplementary Materials, Figure S4). Intriguingly, a TTP signal sequence was observed just before the TI signal sequence of the N gene for KOLV, suggesting the presence of an additional ORF upstream.
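The TTP variants described above differ only in their first base and in the length of the poly(A) run, so they can be located with a single regular expression. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes the sequence is given in the same orientation in which the motifs are written, and it reports the distance to the next AACA without applying any cutoff, since no intergenic distance is given here. The names `find_ttp_signals` and `TTP_PATTERN` are hypothetical.

```python
import re

# [TC]G followed by a run of at least six A residues: TG(A)6, TG(A)7, TG(A)8, CG(A)7, ...
TTP_PATTERN = re.compile(r"([TC])G(A{6,})")
TI_MOTIF = "AACA"


def find_ttp_signals(seq):
    """Return (start, label, gap_to_next_TI) for each TTP-like motif.

    gap_to_next_TI is the number of bases between the end of the TTP motif and
    the next AACA downstream, or None if no AACA follows.
    """
    seq = seq.upper()
    hits = []
    for match in TTP_PATTERN.finditer(seq):
        label = f"{match.group(1)}G(A){len(match.group(2))}"
        ti_pos = seq.find(TI_MOTIF, match.end())
        gap = ti_pos - match.end() if ti_pos != -1 else None
        hits.append((match.start(), label, gap))
    return hits


if __name__ == "__main__":
    # Toy sequence containing the three kinds of variants mentioned in the text.
    toy = "GGCC" + "TGAAAAAAA" + "CTAACAGG" + "CGAAAAAAA" + "TT" + "TGAAAAAA" + "C"
    for hit in find_ttp_signals(toy):
        print(hit)
```

AACA is only four bases long and therefore frequent by chance, so in practice a hit would need to be interpreted together with the position of the neighboring ORFs.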
3. Phylogenetic Analysis of the Bird-Related Rhabdoviruses
A first maximum likelihood phylogenetic analysis was conducted on the complete amino acid sequences of the L protein of the seven bird-related rhabdoviruses, together with 229 representative members of the family Rhabdoviridae available in GenBank (Figure 1, and Supplementary Materials, Table S2). Based on this phylogeny, the seven bird-related rhabdoviruses clustered together within the genus Sunrhavirus with high bootstrap support. Within this genus, they were strongly associated with GARV, one of the other bird-related rhabdoviruses found in this genus, whereas the other bird-related rhabdovirus, SUNV, was more genetically distant.
Figure 1. Phylogenetic classification of the seven bird sunrhaviruses. A maximum likelihood phylogenetic tree was made using IQ-TREE 1.6.10, on the full amino acid sequence of the L protein, including seven bird-related sunrhaviruses and 229 other rhabdoviruses previously reported on GenBank, using the LG + G + L + F model with 10,000 ultrafast bootstraps. The rhabdovirus genera that were not related to birds were collapsed in the phylogenetic tree. Bird-related rhabdoviruses are indicated by a dedicated symbol, and the bird-related genera are shown in bold. Unclassified rhabdoviruses are indicated by black dots. The bird-related rhabdoviruses described in this study are highlighted in gray. All bootstrap proportion values (BSP) > 80% are specified. The scale bar indicates nucleotide substitutions per site.
A second phylogenetic analysis was conducted on the genus Sunrhavirus, based on the concatenated nucleotide sequences of the five canonical genes (N, P, M, G, and L) of the seven members already associated with this genus, in addition to the seven new sequences (Figure 2). Interestingly, all seven bird-related rhabdoviruses clustered into the same phylogroup, identified as Clade III, together with GARV, which was also found in birds in Africa, more precisely in the Central African Republic (Figure 2) [1]. The only other bird-related rhabdovirus of the genus, SUNV, fell outside Clade III and clustered with two Australian insect rhabdoviruses in another clade (Clade II) [17]. However, we observed a close genetic relationship between these two clades (Figure 2). Within Clade III, all the viruses appeared to be relatively distant from each other genetically, suggesting that they could be considered individual species.
Figure 2. Phylogenetic classification of all members of the genus Sunrhavirus, including the seven newly described bird-related sunrhaviruses. A maximum likelihood phylogenetic tree was made with PhyML 3.0 on the concatenated nucleotide ORFs (N-P-M-G-L), using the GTR + G + I model with 1000 bootstrap replicates. The main animal reservoirs for each virus are indicated by specific cartoons, and the seven bird sunrhaviruses described in this study are highlighted in gray. The country of isolation for each virus is shown on the right of the illustration. All bootstrap proportion values (BSP) > 80% are specified. The scale bar indicates nucleotide substitutions per site. The classical rabies virus (RABV) was included as an outgroup in the phylogenetic analysis.
4. Genetic Diversity of the Bird-Related Rhabdoviruses
In addition to the phylogenetic analysis, we compared the canonical ORFs (N, P, M, G, and L) of the seven bird-related rhabdoviruses with those of the other members of the genus Sunrhavirus, either individually (complete amino acid sequence of each ORF) or after concatenation (concatenated complete nucleotide sequences) (Table 2, and Supplementary Materials, Table S6). The close genetic relationship between these seven bird-related rhabdoviruses shown by the phylogeny was confirmed by the amino acid and nucleotide identities of these canonical ORFs. These identity analyses also confirmed that these viruses were putative individual species. Indeed, a high level of diversity was observed both among these viruses and relative to the other members of the genus Sunrhavirus, with nucleotide identities for the concatenated sequences ranging from 41% to 71.9% within the genus Sunrhavirus, and from 55.5% to 71.9% among the seven bird-related rhabdoviruses (Table 2).
Table 2. Nucleotide identities of concatenated canonical ORFs (N, P, M, G, and L) of members of the Sunrhavirus genus. Identities were calculated through pairwise deletion using MEGA (version 7.0). Newly described bird-related rhabdoviruses are indicated in bold.
| | SJAV 0408RCA | NASV 0410RCA | BBOV 9716RCA | KOLV 9717RCA | OUAV 9718RCA | BEAV 09023EGY | MTYV 09027EGY | GARV | HDV | WCV | SUNV | DDV | OVRV | KWAV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SJAV 0408RCA | | | | | | | | | | | | | | |
| NASV 0410RCA | 71.9 | | | | | | | | | | | | | |
| BBOV 9716RCA | 57.9 | 56.7 | | | | | | | | | | | | |
| KOLV 9717RCA | 57.9 | 56.2 | 71.3 | | | | | | | | | | | |
| OUAV 9718RCA | 58.4 | 56.8 | 56 | 55.7 | | | | | | | | | | |
| BEAV 09023EGY | 57.8 | 56.3 | 56.1 | 55.5 | 58.4 | | | | | | | | | |
| MTYV 09027EGY | 58.2 | 56.7 | 56.4 | 55.8 | 58.8 | 70.1 | | | | | | | | |
| GARV | 58.2 | 56.5 | 55.5 | 55.7 | 59 | 69.2 | 73.2 | | | | | | | |
| HDV | 51.7 | 50 | 50.8 | 50.5 | 51.5 | 51.5 | 51 | 51.1 | | | | | | |
| WCV | 50.6 | 48.9 | 49.8 | 49.4 | 50.6 | 51.3 | 50.6 | 50.5 | 74.8 | | | | | |
| SUNV | 48.9 | 48.1 | 47.5 | 47.2 | 48.6 | 48.8 | 47.9 | 48.2 | 54 | 53.8 | | | | |
| DDV | 42.4 | 41.6 | 42.3 | 41.8 | 42.4 | 42.4 | 42.3 | 42.1 | 43.4 | 42.5 | 41.5 | | | |
| OVRV | 41 | 40.5 | 41 | 41.5 | 41 | 41.8 | 41.6 | 41.4 | 42.4 | 41.9 | 40.8 | 65.2 | | |
| KWAV | 42.3 | 41.4 | 41.9 | 41.7 | 41.9 | 41.6 | 41.8 | 42 | 42.7 | 42.1 | 41.2 | 52.1 | 51.9 | |
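The idea behind identities calculated with pairwise deletion can be sketched in a few lines. This is a simplified illustration, not a reimplementation of MEGA: it assumes the two sequences are already aligned to the same length, that identity is the percentage of matching sites among those compared, and that a site is dropped for a given pair only when either sequence has a gap or ambiguous character at that site. The function name and the set of skipped characters are hypothetical choices.

```python
def pairwise_identity(seq_a, seq_b, skip_chars="-N?"):
    """Percent identity between two aligned sequences using pairwise deletion.

    Sites where either sequence carries a character from skip_chars are ignored
    for this pair only; all remaining sites are compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue  # pairwise deletion: drop this site for this pair only
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 7 comparable sites, 6 of them identical.
    print(round(pairwise_identity("ATG-CCGTA", "ATGACC-TT"), 1))
```

With pairwise deletion, the number of compared sites can differ from one pair to the next, unlike complete deletion, where any column containing a gap in any sequence is removed for all pairs.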
At the individual ORF level, the N and L proteins were the most conserved among the members of the genus Sunrhavirus, with amino acid identities ranging from 23.8% to 86.6% and 38.0% to 85.2%, respectively (Supplementary Materials, Table S6). Among the seven newly described bird-related rhabdoviruses, the amino acid identities ranged from 45.3% to 81% and 58.9% to 83.1% for the N and L proteins, respectively. The level of amino acid identity was lower for the other viral proteins among sunrhaviruses, ranging from 5.7% to 71.1%, 7.1% to 79.7%, and 20.3% to 75.5% for the P, M, and G proteins, respectively (Supplementary Materials, Table S6). For the seven new bird-related rhabdoviruses, identities were 18.6−68.3%, 23.6−66.4%, and 36.1−72.9% for the P, M, and G proteins, respectively. Altogether, these results indicate that these seven bird-related rhabdoviruses represent new virus species within the genus Sunrhavirus.