2. Discussion
To investigate the spread of SARS-CoV-2 in Algeria, we performed a thorough analysis of all the complete and partial SARS-CoV-2 sequences available from Algeria (twenty-nine) in addition to sixty-six sequences sampled worldwide.
Under a relaxed molecular clock with the skyline model, we estimated the date of the most recent common ancestor (MRCA) of the SARS-CoV-2 pandemic in Algeria as 28 January 2020 [29 October 2019, 29 February 2020]. This estimate is coherent with the epidemiological record, as the restriction measures in Algeria began in mid-March 2020 [6]. The evolutionary rate of SARS-CoV-2 estimated in the present study was 5.4043 × 10⁻⁴ substitutions/site/year as of March 2021. By comparison, the substitution rates previously reported were 1.66 × 10⁻³ in February 2020 and 8.99 × 10⁻⁴ in early August 2020, which is in line with the time-dependent pattern of substitution rates observed in viruses [2][42][43][44].
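To put these per-site rates on a more intuitive scale, they can be converted into an expected number of substitutions per genome per year. The following is a minimal sketch rather than part of the original analysis. It assumes a genome length of 29,903 nt (the Wuhan-Hu-1 reference length, which is not stated in this section) and that the previously reported rates are expressed in the same units as ours; the names used are illustrative only.

```python
# Assumption: genome length of the Wuhan-Hu-1 reference (not given in the text).
GENOME_LENGTH_NT = 29_903

# Per-site rates quoted in the text, assumed to be in substitutions/site/year.
RATES = {
    "February 2020 (previous report)": 1.66e-3,
    "early August 2020 (previous report)": 8.99e-4,
    "March 2021 (present study)": 5.4043e-4,
}


def substitutions_per_genome_per_year(rate_per_site, genome_length=GENOME_LENGTH_NT):
    """Scale a per-site yearly rate to the whole genome (hypothetical helper)."""
    return rate_per_site * genome_length


per_genome = {
    label: substitutions_per_genome_per_year(rate) for label, rate in RATES.items()
}

for label, value in per_genome.items():
    print(f"{label}: about {value:.1f} substitutions/genome/year")
```

Under these assumptions, the decline of the estimate from the earliest to the latest time point is visible directly on the per-genome scale.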
Moreover, the phylogenetic analysis revealed both multiple disease introductions into Algeria and disease transmission between cities, highlighting the impact of international and domestic travel on disease spread. The first three sequenced samples from March 2020 were introduced from France to Algeria, as previously demonstrated through contact tracing and phylogenetic analysis; yet they did not cluster together in the current study, indicating indirect contamination [6][45]. Likewise, the discrete phylogeographic analysis of the virus expansion in Algeria emphasized between-city transmission: both vertical (from the north to the south and vice versa) and horizontal (within only northern cities or only southern cities) transmissions were observed. For instance, transitions were supported within the northern part of Algeria from Bouira to Blida (Bayes factor, BF = 37.19) and from Bouira to Tizi Ouzou (BF = 6.01); from the north to the south of the country, from Blida to Ain Salah (BF = 5.69); and from Adrar, a southern city, to Boufarik, a municipality in the town of Blida in the north of Algeria (BF = 91.69). The continuous phylogeographic analysis gave more details about the route of expansion by reconstructing the ancestral locations of the virus, indicated as internal nodes. Unsurprisingly, internal nodes were placed in municipalities representing crossing points between several cities, such as Djebahia in the city of Bouira, where travelers take breaks. Another example is Hassi Messaoud in Ouargla, where a large oil station employing several international workers is located. The first formally recognized coronavirus case in Algeria was detected in an Italian worker from this oil station
[6][46]. The phylogeographic results are in accordance with the phylogenetic analysis, emphasizing the importance of local travel and social contact in the spread of the disease. Overall, the ω ratios across all the analyzed coding genes (ORF1a, ORF1b, S, M, N, E, ORF3a) for the Algerian sequences in comparison to the reference genomes were below one, indicating negative selection. The same results were observed in a similar study on SARS-CoV-2 in a Canadian population conducted by Zhang et al. [47]. In addition, a recent analysis of 260,673 whole-genome sequences, performed to study the selection pressure among the coding genes, highlighted the rarity of positive selection in SARS-CoV-2 protein-coding genes [48].
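As a reading aid for the Bayes factors reported for the between-city transitions earlier in this paragraph, the sketch below labels each value on the commonly used Kass and Raftery scale. This scale is an assumption introduced here for illustration; this section does not state which support thresholds were applied in the analysis, and the function name is hypothetical.

```python
# Bayes factors quoted in the text for between-city transitions.
ROUTE_BF = {
    ("Bouira", "Blida"): 37.19,
    ("Bouira", "Tizi Ouzou"): 6.01,
    ("Blida", "Ain Salah"): 5.69,
    ("Adrar", "Boufarik"): 91.69,
}


def kass_raftery_label(bf):
    """Map a Bayes factor to a Kass and Raftery category (assumed convention)."""
    if bf < 1:
        return "favours no transition"
    if bf < 3:
        return "barely worth mentioning"
    if bf < 20:
        return "positive"
    if bf <= 150:
        return "strong"
    return "very strong"


labels = {route: kass_raftery_label(bf) for route, bf in ROUTE_BF.items()}

for (origin, destination), label in labels.items():
    bf = ROUTE_BF[(origin, destination)]
    print(f"{origin} -> {destination}: BF = {bf} ({label})")
```

On this scale, the Bouira to Blida and Adrar to Boufarik routes fall in the "strong" category, while the two remaining routes fall in the "positive" category.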
Complementary to this, several common non-synonymous mutations were detected among the Algerian sequences. These included T85I in the nsp2 gene, P423L in the nsp12 gene, D614G in the S gene, and Q57H in the ORF3a gene. Notably, these mutations were detected in eighty-four countries and are thus considered positively selected. Moreover, amino-acid changes in the Spike protein characteristic of the newly identified SARS-CoV-2 variants were also identified, namely H69del, V70del, E484K, Y144del, and Q52R. These were detected after the repatriation of Algerian nationals from abroad. Following this, Algeria enforced a full lockdown for the second time
[46]. Interestingly, characteristic non-synonymous mutations were identified. For example, the T1004I replacement in the nsp3 gene was detected in the sequence Algeria_EPI_ISL_420037. This mutation was reported only in the USA in the early stages of the pandemic, in sequences collected from 19 January 2020 to 15 April 2020, and was not reported elsewhere. This suggests that the individual who infected the patient sampled as Algeria_EPI_ISL_420037 either had a travel history to the USA or was in contact with an individual who introduced the disease to France from the USA
[49]. Strikingly, unique non-synonymous amino-acid changes were found. In the sequence Algeria/EPI_ISL_766874, the amino-acid substitution A130V in the RdRp gene results in a harmful functional effect on the protein responsible for viral replication. This mutation was first reported in the United Arab Emirates on 12 June 2020 and occurred in only seventy-five samples worldwide, in thirteen countries. In Algeria, it was detected on 21 June 2020. Because the sample collection dates in all the other locations where this mutation was described ranged from 20 August 2020 to 19 April 2021, those locations can be excluded as the origin of the infection, which links the sample Algeria/EPI_ISL_766874 directly to the sample EPI_ISL_698151 from Abu Dhabi. Likewise, the non-synonymous amino-acid replacement N874H in the NSP12 gene of Algeria/EPI_ISL_766875 is deleterious. This amino-acid replacement occurred for the first time in Algeria and in only seven samples worldwide. Based on the collection dates, the sequence can be linked directly to the EPI_ISL_557768 genome from England sampled right after the original sequence on 6 July 2020. Similarly, in the accessory gene ORF3a, which plays an important role in virulence, infectivity, and virus release, the deleterious mutation A23T was first reported from the USA and sampled sixty-two times in fifteen countries; thus, based on collection dates, the sequence Algeria/EPI_ISL_766862 is related to sequences from Texas (USA)
[50]. The last deleterious amino-acid replacement, L129F, was found in the NS3 gene and was detected for the first time in Algeria. It occurred in nine hundred ninety-seven samples worldwide. This mutation lies in the third functional domain of the ORF3a protein (K+ ion channel) and may seriously impact the protein function and, consequently, the virus phenotype
[50]. The circulation of different deleterious mutations is in line with previous reports regarding deleterious mutations in RNA viruses with zoonotic potential. The occurrence of different deleterious mutations simultaneously with the presence of stabilizing mutations may increase virus fitness. This is the case for influenza A/H5N1, which required a combination of mutations to gain airborne transmissibility, of which two were deleterious
[51]. However, purifying selection is not strong enough to eliminate deleterious mutations immediately after they arise; hence, they might circulate long enough to affect the course of viral infection [52]. Moreover, deleterious mutations might be exploited to develop treatment strategies. For instance, a drug such as favipiravir, which increases the rate at which harmful mutations accumulate, could provoke a mutational meltdown (population extinction) and thereby induce collapse of the viral population [53]. In parallel, three neutral amino-acid replacements were identified among the Algerian sequences; these imply no change in protein function
[54]. The E681D amino-acid replacement in the protease gene (NSP3) was first identified in the Algerian genome EPI_ISL_766862 and occurred in only three samples worldwide; based on the sample collection dates, it indicates disease exportation from Algeria to Austria (EPI_ISL_853900). Furthermore, two neutral amino-acid replacements were identified in Algeria/EPI_ISL_418241. The first, the H26Y substitution in the exonuclease gene (NSP14), was discovered for the first time in this Algerian sequence and right after in the Greek sequence EPI_ISL_437907, supporting the relatedness of the two genomes. The second, in the envelope gene, is the substitution of leucine by phenylalanine at position seventy-three, which was first identified in the Algerian sequence and then reported in 2795 samples worldwide. This mutation was previously shown to alter the DLLV motif (changing it to DFLV). It may delay tight junction formation and may therefore hypothetically affect viral replication and/or infectivity
[55]. The above-mentioned viral mutation fingerprints might help characterize and identify both transmission patterns and superspreaders, as previously demonstrated
[56].
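The date-based reasoning used above for the A130V substitution can be sketched as a simple filter on collection dates. This is an illustrative sketch only: it uses the dates quoted in this section, treats the date of the first report in the United Arab Emirates as a collection date, summarizes all remaining locations by their earliest quoted date, and uses a hypothetical helper name.

```python
from datetime import date

# Collection date of the Algerian sample carrying A130V (Algeria/EPI_ISL_766874).
ALGERIA_SAMPLE_DATE = date(2020, 6, 21)

# Earliest dates quoted in the text for the other locations with A130V.
# Assumption: the first report in the United Arab Emirates is a collection date.
CANDIDATE_EARLIEST_DATES = {
    "United Arab Emirates": date(2020, 6, 12),
    "all other locations": date(2020, 8, 20),
}


def possible_sources(target_date, candidates):
    """Keep candidates whose earliest sample does not postdate the target sample."""
    return [name for name, first_seen in candidates.items() if first_seen <= target_date]


sources = possible_sources(ALGERIA_SAMPLE_DATE, CANDIDATE_EARLIEST_DATES)
print(sources)
```

Collection dates only bound the order in which samples were taken, not the order of infection, so such a filter narrows the candidate origins rather than proving a transmission link.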
Meanwhile, the Algerian genomes fell within five lineages. Lineage A is considered the root of the pandemic and includes many sequences originating from China. All Algerian sequences within this lineage were partial genomes (S, NSP16) and carried mutations related to either B.1.1.7 (UK), B.1.351 (South Africa), or B.1.525 (Nigeria). Sequence length is one of the biggest obstacles to an accurate analysis [55]. Furthermore, lineage B.1, a large European clade corresponding approximately to the Italian outbreak, and clade B.1.1, a European lineage defined by three clear SNPs (G28881A, G28882A, and G28883C), were also identified among the Algerian sequences. Similarly, clade B.1.597, corresponding to sequences mainly from France, was identified. Interestingly, one of the Algerian sequences belonged to the B.1.36 lineage, first reported in February 2020 in Saudi Arabia, and clustered with both an Indian sequence (PP = 94%) and a Malaysian sequence (PP = 92%). This sequence was isolated from an eighty-two-year-old woman. The virus was most likely acquired in Saudi Arabia during the pilgrimage, as repatriation flights were scheduled from Saudi Arabia but not from Malaysia or India. These results are in line with reports regarding the repatriation of Algerians from abroad [46].
Furthermore, consistent with the results above, the haplotype network analysis displayed seven median vectors among the Algerian sequences, indicating missing or unsampled data. The multiple introductions were clearly visible in the network and were confirmed by the heterogeneity of the Algerian haplotypes.
In the present study, we described the evolution of the SARS-CoV-2 pandemic in Algeria in a consistent manner, which also reflects the effectiveness of the various measures implemented. Moreover, the strong correlation between the number of confirmed SARS-CoV-2 cases and the population density in each Algerian city implies that the spread of the virus is primarily dependent on social contact, community awareness, and compliance with social distancing, given that fewer cases were observed in some cities with relatively high population density and vice versa. To illustrate, Ouargla, a city located in the south of Algeria, has a population density of 2.63 inhabitants/km² but 2453 confirmed cases, whereas Bordj Bou Arreridj, situated in the north of Algeria, has 506 confirmed cases for a population density of 182.76 inhabitants/km². Overall, the Algerian government's restriction measures were effective in containing the disease and prevented catastrophic scenarios such as the Italian one [57]. This is complementary to an epidemiological study conducted to assess the mitigation measures implemented in Algeria early in the SARS-CoV-2 pandemic (dated 26 April 2020), which demonstrated their efficiency based on the basic reproduction number R0 before and after the implementation of the preventive strategy [6].
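The contrast between the two cities cited above can be made explicit with a short calculation. This is a minimal sketch that only restates the arithmetic on the figures quoted in the text; the variable names are illustrative, and the raw case counts are not normalized by population size, which is not given in this section.

```python
# Figures quoted in the text.
CITIES = {
    "Ouargla": {"density_per_km2": 2.63, "confirmed_cases": 2453},
    "Bordj Bou Arreridj": {"density_per_km2": 182.76, "confirmed_cases": 506},
}

ouargla = CITIES["Ouargla"]
bba = CITIES["Bordj Bou Arreridj"]

# How many times denser Bordj Bou Arreridj is than Ouargla.
density_ratio = bba["density_per_km2"] / ouargla["density_per_km2"]

# How many times more confirmed cases Ouargla reported than Bordj Bou Arreridj.
case_ratio = ouargla["confirmed_cases"] / bba["confirmed_cases"]

print(f"Density ratio (Bordj Bou Arreridj / Ouargla): {density_ratio:.1f}")
print(f"Case ratio (Ouargla / Bordj Bou Arreridj): {case_ratio:.2f}")
```

In other words, the city that is roughly seventy times less dense reported almost five times as many confirmed cases.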
Overall, we explored the evolutionary, genetic, and epidemiological aspects of the SARS-CoV-2 pandemic in Algeria, demonstrating multiple introductions of the disease and the heterogeneity of the genomes. Additionally, our findings revealed unique amino-acid substitutions, for which we characterized the mutational patterns and their effects on the corresponding proteins. In addition, some tracing could be performed based on both unique mutations and travel history. Statistically, we assessed the effectiveness of the mitigation measures implemented against the SARS-CoV-2 pandemic. Admittedly, the main limitations of our study were the length of some sequenced genomes and the size of the Algerian data panel. Thus, we emphasize the importance of massive sampling and sequencing for understanding the disease, and of increased efforts in diagnostics, therapy, and drug and vaccine development. Although Algeria had been under complete travel restrictions since 15 March 2020, the number of cases kept increasing, indicating local transmission. Thus, these local viral variants may potentially represent a distinct strain, as has previously occurred [9].