
Tahir Ul Qamar, M. SARS-CoV-2 Genomics. Encyclopedia. Available online: (accessed on 24 April 2024).
Tahir Ul Qamar, M. (2021, March 01). SARS-CoV-2 Genomics. In Encyclopedia.
SARS-CoV-2 Genomics

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative pathogen of coronavirus disease 2019 (COVID-19), is a major threat to public health. It has spread to more than 200 countries and infected millions of individuals globally. Although SARS-CoV-2 shares structural and genomic similarities with the previously reported SARS-CoV and MERS-CoV, specific mutations in its genome make it a novel virus. Available therapeutic strategies have failed to control it, and despite strict standard operating procedures (SOPs), the virus has spread globally and continues to mutate gradually.

SARS-CoV-2; pandemic; genomic characterization; pathophysiology; therapeutic strategies; COVID-19; vaccines

1. Introduction

The emergence and re-emergence of pathogens is a global human health concern [1]. Coronaviruses are enveloped viruses with non-segmented, single-stranded, positive-sense RNA (+ssRNA) genomes. They belong to the family Coronaviridae in the order Nidovirales and are widely distributed in humans, other animals, and birds. Coronaviruses cause diseases ranging from respiratory infections to hepatic, enteric, and severe neurological diseases, some of them life-threatening [2][3]. Before SARS-CoV-2, six coronavirus species were known to cause human disease [4]; four of these (HKU1, NL63, 229E, and OC43) are widespread and responsible for the common cold in individuals with a weak immune response [4]. SARS-CoV-2 is the seventh coronavirus known to infect humans. Its exact origin is unknown; however, it shows homology with the previously identified coronavirus strains SARS-CoV (intermediate host: masked palm civet) and MERS-CoV (intermediate host: dromedary camel) [5][6]. SARS-CoV-2 shares 82.45% sequence similarity with SARS-CoV and 69.58% with MERS-CoV [7]. SARS-CoV was responsible for the SARS outbreaks of 2002–03 in Guangdong Province, China [8][9][10], while MERS-CoV was responsible for respiratory illness in the Middle East in 2012–13 [11]. The mortality rates of MERS and SARS were 37% and 10%, respectively [12][13]. SARS-CoV-2 triggered the COVID-19 pandemic, which spread rapidly worldwide and has become a public health concern [14].

2. Insights into Genomic Organization

Coronaviruses, which belong to the Coronaviridae family, are enveloped and pleomorphic viruses [15]. They are positive-sense RNA viruses with a genome of approximately 30 kb, among the largest known for an RNA virus, carrying a 5′ cap and a 3′ poly(A) tail. The nucleocapsid is helical and flexible. The viral membrane contains the membrane glycoprotein, the envelope protein, and the spike protein, while the RNA is enclosed by the nucleocapsid [16][17].

The viral RNA contains six open reading frames (ORF1ab, ORF3a, ORF6, ORF7ab, ORF8, and ORF10). ORF1a/1b occupies two-thirds of the genome, and the remaining one-third encodes the viral structural proteins M (membrane), S (spike), N (nucleocapsid), and E (envelope) [18][19].

Transcription is carried out by the replication-transcription complex (RTC), which is enclosed in double-membrane vesicles, through the synthesis of sub-genomic RNA (sgRNA). Transcription terminates at transcription regulatory sequences located between the open reading frames (ORFs). As noted above, the SARS-CoV-2 genome contains six ORFs [18]. A ribosomal frameshift between ORF1a and ORF1b (a programmed shift of the reading frame during translation rather than a mutation) yields two polypeptides, pp1a and pp1ab. These are processed by virally encoded proteases, namely the main protease (Mpro, also known as the chymotrypsin-like protease, 3CLpro) and the papain-like protease, to produce the non-structural proteins (nsps) [20][21]. Apart from ORF1a and ORF1b, the remaining ORFs include those that encode the structural proteins (membrane, nucleocapsid, envelope, and spike), as shown in Figure 1.

By comparing SARS-CoV-2 and SARS-CoV sequences, researchers proposed that a mutation in the spike protein may be responsible for the jump of the virus from animals to humans [22]. Mutations have also been found that change amino acid residues in the encoded proteins. For example, position 723 carries a serine instead of a glycine, and position 1010 carries a proline instead of an isoleucine [22]. Because mutations accumulate in the viral genome over time, the potential for disease recurrence depends on how the virus evolves.
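Amino acid substitutions of the kind described above can be listed by comparing two aligned protein sequences position by position. The following is a minimal sketch only: the function name, the `offset` parameter, and the two toy peptides are hypothetical and are not taken from the cited study, and the sequences are assumed to be already aligned to the same length.

```python
def find_substitutions(ref, query, offset=1):
    """List (position, ref_residue, query_residue) for each mismatch.

    Both sequences must already be aligned and of equal length.
    Gap characters ('-') are skipped, not reported as substitutions.
    `offset` is the residue number assigned to the first column.
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")
    changes = []
    for i, (a, b) in enumerate(zip(ref, query)):
        if a != b and a != "-" and b != "-":
            changes.append((i + offset, a, b))
    return changes


# Made-up peptides for illustration (not real viral sequences).
ref_fragment = "MKGLIAV"
query_fragment = "MKSLPAV"
subs = find_substitutions(ref_fragment, query_fragment)
for pos, old, new in subs:
    print(f"position {pos}: {old} -> {new}")
```

In this toy case the comparison reports a glycine-to-serine change and an isoleucine-to-proline change, the same kinds of substitution as in the example from the text.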

Figure 1. Complete structural and genomic organization of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [23].

2.1. Genome Sequencing

Genomic sequence analysis has confirmed that SARS-CoV-2, although similar in many respects to SARS-CoV and other related coronaviruses, is a novel virus (Table 1). The virus shifted host from animals to humans with a few unique modifications/mutations. Most of the viral contigs/reads resembled beta-coronavirus genomes. SARS-CoV-2 shows 96.20% and 88.00% similarity to the previously published SARSr-CoV (RaTG13) and bat-SL-CoVZC45 genomes, respectively. Another study reported 69.58% and 82.45% sequence similarity with the MERS-CoV and SARS-CoV genomes, respectively [24]. Ten viral genome sequences obtained from nine patients exhibited 99.98% sequence identity. In another study, sequences from eight patient samples shared 99.98% identity with each other across the whole genome [24].

A BLASTn search with SARS-CoV-2 sequences identified the most closely related previously known viruses as two SARS-like beta-coronaviruses of bat origin: bat-SL-CoVZC45 (sequence identity 88%; query coverage 99%) and bat-SL-CoVZXC21 (sequence identity 88%; query coverage 98%). In five gene regions (7, M, N, 14, and E), sequence identity exceeded 90%, with the envelope (E) gene highest at 98.7%. The spike (S) gene showed the lowest sequence identity, 75%, while the 1a and 1b gene regions showed 90% and 87%, respectively [24]. Most proteins encoded by SARS-CoV-2 were highly similar to those of bat-related coronaviruses, with a few insertions and deletions [24]. However, protein 13 and the S protein showed only 73.2% and 80% identity, respectively, with other bat-derived viral proteins [25]. SARS-CoV-2 encodes a large spike protein, which is a major distinguishing feature among SARS-CoV-2, SARS-CoV, MERS-CoV, and other bat-derived coronaviruses.

Comparison of predicted coding regions shows that SARS-CoV-2 has the same genomic organization as bat-SL-CoVZXC21, SARS-CoV, and bat-SL-CoVZC45. Ten coding regions were identified: E, M, N, S, 10ab, 9, 8, 7, 3, and 1ab [24].
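The percent identity figures quoted in this section are, in essence, the share of aligned positions at which two sequences carry the same base. The sketch below illustrates that calculation under simplifying assumptions: the inputs are already aligned, gap columns are excluded from the denominator, and the short sequences are invented for the example. Tools such as BLASTn define the denominator in their own way, so this function (a hypothetical helper, not part of any cited pipeline) will not necessarily reproduce their reported values.

```python
def percent_identity(seq_a, seq_b):
    """Percent of non-gap aligned columns where both sequences match.

    Assumes seq_a and seq_b come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (non-gap) columns")
    return 100.0 * matches / compared


# Invented toy alignment: one mismatch in ten columns.
identity = percent_identity("ATGCATGCAT", "ATGCTTGCAT")
print(f"{identity:.1f}% identity")
```

Excluding gap columns is only one possible convention; counting gaps as mismatches would give lower values for alignments with insertions or deletions.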

Table 1. Sequence homology between SARS-CoV-2 and other coronavirus strains.

Columns: Coronavirus Strains; Sequence Similarity

2.2. Phylogenetic Analysis

Phylogenetic analysis of SARS-CoV-2 genomes obtained from early patient samples showed a sequence organization similar to that of beta-coronaviruses: a 5′ untranslated region (UTR), the replicase complex (orf1ab), four genes (M, N, S, and E), a 3′ UTR, and some unidentified non-structural open reading frames (ORFs) [26]. Despite its sequence similarity with beta-coronaviruses discovered in bats, SARS-CoV-2 is distinct from SARS-CoV as well as MERS-CoV. Further evidence of its novelty is that sequence identity in the conserved replicase domains (ORF1ab) is less than 90% between SARS-CoV-2 and other members of the beta-coronaviruses and the Sarbecovirus subgenus of the Coronaviridae family.

2.3. Conserved Proteins

The S protein mediates receptor binding and membrane fusion and is critical in determining host tropism and transmission capacity. The S protein of SARS-CoV-2 has two domains, S1 and S2: S1 is responsible for receptor binding and S2 for membrane fusion [27]. A cellular protease, furin, has been reported to cleave the S1/S2 site, and this cleavage is necessary for virion entry into human lung cells and for S protein-mediated cell fusion [28]. The S1 and S2 domains of SARS-CoV-2 have a sequence similarity of 93% and 68% with bat-SL-CoVZXC21 and bat-SL-CoVZC45, respectively [29]. Amino acid variations in the S protein have been identified among sarbecoviruses. Although SARS-CoV and SARS-CoV-2 belong to different clades in the phylogenetic tree, they share 50 conserved amino acids in the S1 domain of the S protein. MERS-CoV, however, has mutational differences in its S protein, most of which occur in the C-terminal domain. Several other proteases are also involved in processes such as virion entry, polyprotein maturation, and assembly of virion particles [30]. Besides the S protein, several other SARS-CoV-2 proteins show similarity with proteins of other Coronaviridae family members, as shown in Table 2.
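A protease cleavage site such as the S1/S2 site mentioned above is often located computationally by scanning a protein sequence for a recognition motif. The sketch below assumes the commonly cited minimal furin motif R-X-X-R (arginine, any two residues, arginine), with cleavage after the final arginine. That motif is an assumption of this example and is not stated in this entry, and the peptide is a made-up fragment, not the real spike sequence.

```python
import re

# Assumed minimal furin recognition motif: R-X-X-R.
# The lookahead allows overlapping motifs to be reported.
FURIN_LIKE_MOTIF = re.compile(r"(?=(R..R))")


def find_furin_like_sites(protein):
    """Return (start, motif, cleave_after) for each R-X-X-R match.

    Positions are 1-based; `cleave_after` is the position of the last
    arginine, after which cleavage is assumed to occur.
    """
    sites = []
    for match in FURIN_LIKE_MOTIF.finditer(protein.upper()):
        start = match.start() + 1
        sites.append((start, match.group(1), start + 3))
    return sites


# Hypothetical peptide containing one multibasic stretch.
peptide = "AAPRRARSVGG"
sites = find_furin_like_sites(peptide)
for start, motif, cleave_after in sites:
    left, right = peptide[:cleave_after], peptide[cleave_after:]
    print(f"{motif} at {start}: {left} | {right}")
```

A motif match only marks a candidate site; whether furin actually cleaves there depends on structural context and has to be confirmed experimentally, as in the study cited above.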

Table 2. Percentage identity between proteins of SARS-CoV-2 and the Coronaviridae family [31].


Columns: SARS NC_004718.3; Bat MG772934.1; Bat DQ022305.2

Rows: S (Spike); E (Envelope); M (Membrane); N (Nucleocapsid)

2.4. Receptor Binding Domain (RBD)

The RBD of SARS-CoV-2 is located in the C-terminal domain of the spike protein, as in SARS-CoV, Bat CoV HKU4, and MERS-CoV [32][33]. SARS-CoV-2 has also been reported to use angiotensin-converting enzyme 2 (ACE2) as the cell receptor for entry into human cells [34]. Phylogenetic analysis showed that, at the genome level, SARS-CoV-2 is closely related to bat-SL-CoVZXC21 and bat-SL-CoVZC45, whereas its RBD is highly similar to that of SARS-CoV. However, key RBD residues responsible for receptor binding differ between SARS-CoV-2 and SARS-CoV. Taken together, these studies again establish that although SARS-CoV-2 shares considerable similarity with MERS-CoV, SARS-CoV, and some other bat-derived coronaviruses, it is a novel coronavirus and is responsible for an infection that is spreading globally.


  1. Sohrab, S.S.; Azhar, E.I. Genetic diversity of MERS-CoV spike protein gene in Saudi Arabia. J. Infect. Public Health 2020, 13, 709–717, doi:10.1016/j.jiph.2019.11.007.
  2. Weiss, S.R.; Leibowitz, J.L. Coronavirus Pathogenesis. Adv. Virus Res. 2011, 81, 85–164, doi:10.1016/B978-0-12-385885-6.00009-2.
  3. Zhu, N.; Zhang, D.; Wang, W.; Li, X.; Yang, B.; Song, J.; Zhao, X.; Huang, B.; Shi, W.; Lu, R.; et al. A novel coronavirus from patients with pneumonia in China, 2019. N. Engl. J. Med. 2020, 382, 727–733, doi:10.1056/NEJMoa2001017.
  4. Su, S.; Wong, G.; Shi, W.; Liu, J.; Lai, A.C.K.; Zhou, J.; Liu, W.; Bi, Y.; Gao, G.F. Epidemiology, Genetic Recombination, and Pathogenesis of Coronaviruses. Trends Microbiol. 2016, 24, 490–502.
  5. Yu, F.; Du, L.; Ojcius, D.M.; Pan, C.; Jiang, S. Measures for diagnosing and treating infections by a novel coronavirus responsible for a pneumonia outbreak originating in Wuhan, China. Microbes Infect. 2020, 22, 74–79, doi:10.1016/j.micinf.2020.01.003.
  6. Zhao, J.; Cui, W.; Tian, B. The potential intermediate hosts for SARS-CoV-2. Front. Microbiol. 2020, 11, 1814–1820.
  7. Kaur, N.; Singh, R.; Dar, Z.; Bijarnia, R.K.; Dhingra, N.; Kaur, T. Genetic comparison among various coronavirus strains for the identification of potential vaccine targets of SARS-CoV2. Infect. Genet. Evol. 2020, 89, 104490.
  8. Zhong, N.; Zheng, B.; Li, Y.; Poon, L.; Xie, Z.; et al. Epidemiology and Cause of Severe Acute Respiratory Syndrome (SARS) in Guangdong, People’s Republic of China, in February. Lancet 2003, 362, 1353–1358.
  9. Ksiazek, T.G.; Erdman, D.; Goldsmith, C.S.; Zaki, S.R.; Peret, T.; Emery, S.; Tong, S.; Urbani, C.; Comer, J.A.; Lim, W.; et al. A novel coronavirus associated with severe acute respiratory syndrome. N. Engl. J. Med. 2003, 348, 1953–1966, doi:10.1056/NEJMoa030781.
  10. Drosten, C.; Günther, S.; Preiser, W.; Van der Werf, S.; Brodt, H.R.; Becker, S.; Rabenau, H.; Panning, M.; Kolesnikova, L.; Fouchier, R.A.M.; et al. Identification of a novel coronavirus in patients with severe acute respiratory syndrome. N. Engl. J. Med. 2003, 348, 1967–1976, doi:10.1056/NEJMoa030747.
  11. Zaki, A.M.; Van Boheemen, S.; Bestebroer, T.M.; Osterhaus, A.D.M.E.; Fouchier, R.A.M. Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia. N. Engl. J. Med. 2012, 367, 1814–1820, doi:10.1056/NEJMoa1211721.
  12. WHO. Summary of Probable SARS Cases with Onset of Illness from 1 November 2002 to 31 July 2003. Available online: (accessed on 1 January 2021).
  13. Assiri, A.; McGeer, A.; Perl, T.M.; Price, C.S.; Al Rabeeah, A.A.; Cummings, D.A.T.; Alabdullatif, Z.N.; Assad, M.; Almulhim, A.; Makhdoom, H.; et al. Hospital outbreak of middle east respiratory syndrome coronavirus. N. Engl. J. Med. 2013, 369, 407–416, doi:10.1056/NEJMoa1306742.
  14. Wong, G.; Liu, W.; Liu, Y.; Zhou, B.; Bi, Y.; Gao, G.F. MERS, SARS, and Ebola: The Role of Super-Spreaders in Infectious Disease. Cell Host Microbe 2015, 18, 398–401.
  15. Peiris, J.S.M.; Guan, Y.; Yuen, K.Y. Severe acute respiratory syndrome. Nat. Med. 2004, 10, S88–S97, doi:10.1038/nm1143.
  16. Neuman, B.W.; Adair, B.D.; Yoshioka, C.; Quispe, J.D.; Orca, G.; Kuhn, P.; Milligan, R.A.; Yeager, M.; Buchmeier, M.J. Supramolecular Architecture of Severe Acute Respiratory Syndrome Coronavirus Revealed by Electron Cryomicroscopy. J. Virol. 2006, 80, 7918–7928, doi:10.1128/jvi.00645-06.
  17. Barcena, M.; Oostergetel, G.T.; Bartelink, W.; Faas, F.G.A.; Verkleij, A.; Rottier, P.J.M.; Koster, A.J.; Bosch, B.J. Cryo-electron tomography of mouse hepatitis virus: Insights into the structure of the coronavirion. Proc. Natl. Acad. Sci. USA 2009, 106, 582–587, doi:10.1073/pnas.0805270106.
  18. Tsai, P.H.; Wang, M.L.; Yang, D.M.; Liang, K.H.; Chou, S.J.; Chiou, S.H.; Lin, T.H.; Wang, C.T.; Chang, T.J. Genomic variance of Open Reading Frames (ORFs) and Spike protein in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). J. Chin. Med. Assoc. 2020, 83, 725–732, doi:10.1097/JCMA.0000000000000387.
  19. Alamri, M.A.; Qamar, M.T.U.; Mirza, M.U.; Alqahtani, S.M.; Froeyen, M.; Chen, L.-L. Discovery of human coronaviruses pan-papain-like protease inhibitors using computational approaches. J. Pharm. Anal. 2020, 10, 546–559.
  20. Perlman, S.; Netland, J. Coronaviruses post-SARS: Update on replication and pathogenesis. Nat. Rev. Microbiol. 2009, 7, 439–450.
  21. Alamri, M.A.; Qamar, M.T.U.; Mirza, M.U.; Bhadane, R.; Alqahtani, S.M.; Muneer, I.; Froeyen, M.; Salo-Ahen, O.M.H. Pharmacoinformatics and molecular dynamics simulation studies reveal potential covalent and FDA-approved inhibitors of SARS-CoV-2 main protease 3CLpro. J. Biomol. Struct. Dyn. 2020, doi:10.1080/07391102.2020.1782768.
  22. Angeletti, S.; Benvenuto, D.; Bianchi, M.; Giovanetti, M.; Pascarella, S.; Ciccozzi, M. COVID-2019: The role of the nsp2 and nsp3 in its pathogenesis. J. Med. Virol. 2020, 92, 584–588, doi:10.1002/jmv.25719.
  23. Qamar, M.T.U.; Rehman, A.; Tusleem, K.; Ashfaq, U.A.; Qasim, M.; Zhu, X.; Fatima, I.; Shahid, F.; Chen, L.L. Designing of a next generation multiepitope based vaccine (MEV) against SARS-COV-2: Immunoinformatics and in silico approaches. PLoS ONE 2020, 15, e0244176, doi:10.1371/journal.pone.0244176.
  24. Lu, R.; Zhao, X.; Li, J.; Niu, P.; Yang, B.; Wu, H.; Wang, W.; Song, H.; Huang, B.; Zhu, N.; et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: Implications for virus origins and receptor binding. Lancet 2020, 395, 565–574, doi:10.1016/S0140-6736(20)30251-8.
  25. Naqvi, A.A.T.; Fatima, K.; Mohammad, T.; Fatima, U.; Singh, I.K.; Singh, A.; Atif, S.M.; Hariprasad, G.; Hasan, G.M.; Hassan, M.I. Insights into SARS-CoV-2 genome, structure, evolution, pathogenesis and therapies: Structural genomics approach. Biochim. Biophys. Acta (BBA)-Mol. Basis Dis. 2020, 1866, 165878.
  26. Qamar, M.T.U.; Alqahtani, S.M.; Alamri, M.A.; Chen, L.-L. Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants. J. Pharm. Anal. 2020, 10, 313–319.
  27. Muhseen, Z.T.; Hameed, A.R.; Al-Hasani, H.M.H.; Qamar, M.T.U.; Li, G. Promising terpenes as SARS-CoV-2 spike receptor-binding domain (RBD) attachment inhibitors to the human ACE2 receptor: Integrated computational approach. J. Mol. Liq. 2020, 320, 114493.
  28. Hoffmann, M.; Kleine-Weber, H.; Pöhlmann, S. A Multibasic Cleavage Site in the Spike Protein of SARS-CoV-2 Is Essential for Infection of Human Lung Cells. Mol. Cell 2020, 78, 779–784, doi:10.1016/j.molcel.2020.04.022.
  29. Ren, L.L.; Wang, Y.M.; Wu, Z.Q.; Xiang, Z.C.; Guo, L.; Xu, T.; Jiang, Y.Z.; Xiong, Y.; Li, Y.J.; Li, X.W.; et al. Identification of a novel coronavirus causing severe pneumonia in human: A descriptive study. Chin. Med. J. 2020, 133, 1015–1024, doi:10.1097/CM9.0000000000000722.
  30. Gioia, M.; Ciaccio, C.; Calligari, P.; Simone, G.D.; Sbardella, D.; Tundo, G.; Francesco, G.; Di, A.; Di, D. Role of proteolytic enzymes in the COVID-19 infection and promising therapeutic approaches. Biochem. Pharmacol. 2020, 182, 1–22, doi:10.1016/j.bcp.2020.114225.
  31. Ceraolo, C.; Giorgi, F.M. Genomic variance of the 2019-nCoV coronavirus. J. Med. Virol. 2020, 92, 522–528, doi:10.1002/jmv.25700.
  32. Lau, S.K.P.; Li, K.S.M.; Tsang, A.K.L.; Lam, C.S.F.; Ahmed, S.; Chen, H.; Chan, K.-H.; Woo, P.C.Y.; Yuen, K.-Y. Genetic Characterization of Betacoronavirus Lineage C Viruses in Bats Reveals Marked Sequence Divergence in the Spike Protein of Pipistrellus Bat Coronavirus HKU5 in Japanese Pipistrelle: Implications for the Origin of the Novel Middle East Respiratory Sy. J. Virol. 2013, 87, 8638–8650, doi:10.1128/jvi.01055-13.
  33. Du, L.; He, Y.; Zhou, Y.; Liu, S.; Zheng, B.J.; Jiang, S. The spike protein of SARS-CoV—A target for vaccine and therapeutic development. Nat. Rev. Microbiol. 2009, 7, 226–236, doi:10.1038/nrmicro2090.
  34. Lan, J.; Ge, J.; Yu, J.; Shan, S.; Zhou, H.; Fan, S.; Zhang, Q.; Shi, X.; Wang, Q.; Zhang, L.; et al. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature 2020, 581, 215–220, doi:10.1038/s41586-020-2180-5.
Subjects: Nursing
Update Date: 02 Mar 2021