Gregor Mendel, recognized as the Father of Modern Genetics, was an Austrian monk who established the foundational principles of heredity through his breeding experiments on the common pea (Pisum sativum) and coined the terms dominant and recessive
[1]. Mendel considered peas a suitable model system mainly because of their distinct, constant differentiating characteristics and because their hybrids yield perfectly fertile progeny
[2][3]. After eight years of painstaking work, he published his findings, entitled “Experiments on Plant Hybridization”, in 1866, proposing the principles of uniformity, segregation, and independent assortment
[1]. Unlike the work of his contemporary Charles Darwin, who developed the theory of evolution, Mendel’s work was not widely known until the 1900s, and its relevance fell in and out of favour as genetic theory continued to develop
[2][4]. In 1882, chromosomes were first described by Walther Flemming, the founder of cytogenetics, who pioneered the study of mitosis
[5]. The chromosomal theory of heredity, however, was only established in 1910, when Thomas Hunt Morgan discovered sex chromosome inheritance through his breeding analysis of millions of wild-type red-eyed and white-eyed fruit flies (Drosophila melanogaster)
[6]. Morgan’s findings confirmed Mendel’s principles of heredity and showed that genes are located on chromosomes
[7].
Initial efforts to sequence a gene were cumbersome and time-consuming. For example, it took months to sequence a mere 24-base-pair stretch of the Escherichia coli lactose operon using Maxam–Gilbert sequencing, which involved extensive use of hazardous chemicals
[15]. Invented by Allan Maxam and Walter Gilbert in the mid-1970s, the Maxam–Gilbert sequencing method involves chemical alteration of DNA and subsequent cleavage at specific bases, which necessitates radioactive labelling at one end and purification of the DNA fragment of interest
[15][16]. The dawn of rapid sequencing began in 1977 when the chain termination method, better known as Sanger sequencing, was developed based on the process of DNA replication
[17]. Pioneered by Frederick Sanger, the technique was used to sequence the first DNA genome, that of bacteriophage ϕX174, which has since been widely used as a positive control genome in sequencing laboratories around the world
[17][18]. Another notable invention from that period is the polymerase chain reaction (PCR), a technique for generating millions of copies of a specific section of DNA
[19]. Invented by Kary Mullis in 1983, the technique has since been employed for a variety of applications, including decoding the human genome, preserving animals and coral reefs, and, most recently, detecting COVID-19
[20].
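The "millions of copies" figure follows from simple doubling arithmetic. The sketch below is an idealised illustration rather than a model of any real reaction: it assumes that each cycle multiplies the number of templates by (1 + efficiency). The function names and the `efficiency` parameter are assumptions introduced here for illustration.

```python
def pcr_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a given number of PCR cycles.

    efficiency = 1.0 is the idealised case in which every template is
    duplicated in every cycle; real reactions fall below this.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_templates * (1.0 + efficiency) ** cycles


def cycles_to_reach(target_copies: float, initial_templates: int = 1,
                    efficiency: float = 1.0) -> int:
    """Smallest number of cycles that yields at least target_copies."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    copies = float(initial_templates)
    cycles = 0
    while copies < target_copies:
        copies *= 1.0 + efficiency  # one round of amplification
        cycles += 1
    return cycles


if __name__ == "__main__":
    # Perfect doubling from a single template molecule.
    print(pcr_copies(1, 20))           # 2**20 = 1048576.0
    print(cycles_to_reach(1_000_000))  # 20
```

Under the perfect-doubling assumption, a single template exceeds one million copies after 20 cycles; with a lower assumed efficiency, more cycles are needed to reach the same number.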
2. How Has the Cracking of Genetic Code Improved Life on Earth?
Life has existed on Earth for approximately four billion years, yet the genetic code was deciphered only in the 1960s, after 99% of human history had already been documented
[28]. Within the last half-century, the world has witnessed tremendous discoveries in all critical areas of the life sciences, from medicine to agriculture. The human genome sequence was completed in the early 2000s, around the same time as those of a handful of model organisms, including arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and fruit fly (Drosophila melanogaster) (Figure 1). After the birth of next-generation sequencing (NGS) technologies in the mid-2000s, hundreds of other organisms had their genomes completely sequenced, and millions of genes were annotated, be they genes responsible for severe diseases in humans or genes conferring resistance and tolerance in crops
[29]. This is especially true for organisms with small genomes, such as viruses and single-celled organisms (bacteria and protozoa), for which the number of sequenced genomes has grown explosively
[30]. Both partial and complete SARS-CoV-2 genome sequences were obtained in the first two months of the COVID-19 epidemic
[31]. However, like all other viruses, SARS-CoV-2 accumulates mutations, or small changes in its genome, demonstrating that the virus is evolving
[32]. To allow the human genetics community to share important outcomes of the genetic determinants of COVID-19 susceptibility and severity, the COVID-19 Host Genetics Initiative
[33] was established in spring 2020, and genetic association results for several genes and gene clusters (such as TYK2, DPP9, and OAS1/2/3) were publicly released in January 2021
[34].
Figure 1. Notable genetic discoveries in the past one and a half centuries.
Although the sequencing of multicellular organisms has lagged slightly behind, a handful of massive sequencing projects are ongoing, including the Earth BioGenome Project, which aims to sequence the genomes of all 1.5 million known eukaryotic species, and the Darwin Tree of Life Project, which seeks to sequence the genomes of all 66,000 species of animals, plants, and fungi in the United Kingdom over the course of a decade
[35][36]. To date, there are more than 16,000 sequenced genomes of eukaryotes available in the public domain (https://www.ncbi.nlm.nih.gov).
Table 1 presents some examples of well-annotated genomes of organisms from the major eukaryotic kingdoms that have been sequenced since the 2000s.
Table 1. Examples of multicellular organisms with well-annotated genomes.
| Kingdom | Species | Relevance | Estimated Genome Size (Mbp) | Reference |
| --- | --- | --- | --- | --- |
| Animalia | Aedes mosquito (Aedes aegypti) | Primary vector for yellow and dengue fevers | 1380 | [37] |
| | Cattle (Bos taurus) | Ruminant biology and evolution | 2870 | [38] |
| | Coelacanth (Latimeria chalumnae) | Tetrapod evolution | 2860 | [39] |
| | Common chimpanzee (Pan troglodytes) | Model organism (human population genetics and evolution) | 2400 | [40] |
| | Common marmoset (Callithrix jacchus) | Biomedical research application | 2260 | [41] |
| | Giant panda (Ailuropoda melanoleuca) | Foundation for promoting mammalian genetic research | 2250 | [42] |
| | Honeybee (Apis mellifera) | Model organism (social behaviour and global ecology) | 1800 | [43] |
| | Japanese medaka (Oryzias latipes) | Vertebrate evolution | 700 | [44] |
| | Pacific oyster (Crassostrea gigas) | Lophotrochozoa evolution | 559 | [45] |
| | Platypus (Ornithorhynchus anatinus) | Model organism (combination of reptilian and mammalian characters) | 1840 | [46] |
| | Red flour beetle (Tribolium castaneum) | Model organism (beetle and pest) | 160 | [47] |
| | Sea urchin (Strongylocentrotus purpuratus) | Model organism (developmental and systems biology) | 814 | [48] |
| | Sponge (Amphimedon queenslandica) | Animal origins and early evolution | 167 | [49] |
| | Two-spotted spider mite (Tetranychus urticae) | Cosmopolitan agricultural pest | 90 | [50] |
| | Western gorilla (Gorilla gorilla) | Human origins and evolution | 5400 | [51] |
| | Mexican axolotl (Ambystoma mexicanum) | Evolutionary changes in key tissue formation regulators | 32,000 | [52] |
| | Galapagos cormorant (Phalacrocorax harrisi) | Evolutionary changes in the size and proportion of limbs | 1200 | [53] |
| | Golden orb-weaver (Nephila clavipes) | Diversity of spider silk genes and their complex expression | 2440 | [54] |
| Plantae | African oil palm (Elaeis guineensis) | Oil-bearing crop | 1800 | [55] |
| | Amborella (Amborella trichopoda) | Angiosperm evolution | 870 | [56] |
| | Barrel medic (Medicago truncatula) | Model organism (legume) | 246 | [57] |
| | China rose (Rosa chinensis) | Model organism (ornamental plant) | 560 | [58] |
| | Dwarf banana (Musa acuminata) | Genome of a modern cultivar | 523 | [59] |
| | Maize (Zea mays) | Major cereal crop | 2300 | [60] |
| | Papaya (Carica papaya) | Tropical fruit crop | 372 | [61] |
| | Peanut (Arachis duranensis, A. ipaensis, A. hypogaea) | Polyploid genetic mechanisms | 2540 | [62][63] |
| | Pigeon pea (Cajanus cajan) | Model organism (legume) | 833 | [64] |
| | Potato (Solanum tuberosum) | Major root crop | 844 | [65] |
| | Quinoa (Chenopodium quinoa) | Future crop | 1500 | [66] |
| | Rose gum (Eucalyptus grandis) | Fibre and timber crop | 640 | [67] |
| | Sorghum (Sorghum bicolor) | Major cereal crop | 730 | [68] |
| | Soybean (Glycine max) | Major protein and oil crop | 1115 | [69] |
| | Tomato (Solanum lycopersicum) | Major vegetable crop | 900 | [70] |
| | Silver birch (Betula pendula) | Model organism (forest biotechnology) | 440 | [71] |
| | Durian (Durio zibethinus) | Tropical fruit biology and agronomy | 738 | [72] |
| | Sunflower (Helianthus annuus) | Oil metabolism, flowering, and Asterid evolution | 3600 | [73] |
| | Tausch’s goatgrass (Aegilops tauschii) | Genetic resources for wheat | 4300 | [74] |
| | Barley (Hordeum vulgare) | Major cereal crop | 4800 | [75] |
| | Pearl millet (Pennisetum glaucum) | Future crop | 1790 | [76] |
| Fungi | Black mold (Aspergillus niger) | Model fungus | 34 | [77] |
| | Filamentous fungus (Aspergillus nidulans, A. fumigatus, A. oryzae) | Model fungus | 40 | [78] |
| | Fission yeast (Schizosaccharomyces pombe) | Model yeast | 14 | [79] |
| | Rice blast fungus (Magnaporthe grisea) | Model fungus | 40 | [80] |
| | Split gill (Schizophyllum commune) | Model mushroom | 39 | [81] |
| | Yeast (Candida albicans) | Human pathogen | 4 | [82] |
| | Filamentous fungus (Penicillium chrysogenum) | Industrial use | 32 | [83] |
An organism’s genetic code is made up of merely four bases (A, C, G, and T), yet a change in just a single base among thousands, often referred to as a single nucleotide polymorphism, can alter protein structure and function, affecting one or more traits of the organism. Abnormal changes in the DNA of a gene are termed gene mutations, which may have little to no noticeable effect or may considerably affect cells in numerous ways
[86]. Some mutations cause a gene to be turned on, making more of the protein than usual, while a small percentage of mutations have been found to cause genetic disorders
[86][87]. For instance, a mutated version of the beta-globin gene that helps make haemoglobin causes sickle cell anaemia
[88]. Recently, a robust sequence-resolved benchmark set for detection of both false positive and false negative germline large insertions and deletions has been developed
[89]. Modern genetic modification, also known as genetic engineering, is the process of altering the genetic makeup of an organism using recombinant DNA (rDNA) technology. A gene (sometimes two or more) from one species is isolated, spliced into a vector with the aid of restriction enzymes, and then introduced into the host species, creating a “transgenic” organism, also called a genetically modified organism (GMO), with desirable characteristics
[1]. The first genetically engineered animal (a mouse, Mus musculus) and the first genetically engineered plant (tobacco, Nicotiana tabacum) were produced in 1974 and 1983, respectively. The advent of genetic engineering initially sparked concern from various parties, including governments, scientists, and the media, over its potential adverse effects on human health and ecosystems worldwide
[90]. Following the establishment of safe and practical guidelines for rDNA research in 1975, the technology has continued to advance rapidly, impacting medicine, agriculture, and biodiversity
[91][92].
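The earlier point that a single-base change can alter a protein can be made concrete with the sickle cell example. The sketch below is illustrative only: the DNA fragment is a short stretch modelled on the start of the beta-globin coding sequence, the codon table contains only the codons needed here, and the helper names are hypothetical. The one biological fact relied upon is the well-known GAG to GTG substitution that replaces glutamic acid with valine.

```python
from typing import List, Tuple

# Partial codon table: only the codons that occur in this toy fragment.
CODON_TABLE = {
    "ATG": "Met", "GTG": "Val", "CAT": "His", "CTG": "Leu",
    "ACT": "Thr", "CCT": "Pro", "GAG": "Glu", "AAG": "Lys",
}


def translate(dna: str) -> List[str]:
    """Translate a coding-strand DNA string into a list of amino acids."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3)]


def substitute(dna: str, position: int, new_base: str) -> str:
    """Return a copy of dna with one base replaced (0-based position)."""
    return dna[:position] + new_base + dna[position + 1:]


def differences(a: str, b: str) -> List[Tuple[int, str, str]]:
    """List (position, base_in_a, base_in_b) for every mismatch."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Illustrative fragment: Met-Val-His-Leu-Thr-Pro-Glu-Glu-Lys
NORMAL = "ATGGTGCATCTGACTCCTGAGGAGAAG"
# Change the middle base of the first GAG codon (0-based index 19) from A to T.
MUTANT = substitute(NORMAL, 19, "T")

if __name__ == "__main__":
    print(differences(NORMAL, MUTANT))  # a single A -> T substitution
    print(translate(NORMAL))
    print(translate(MUTANT))            # Glu replaced by Val at one position
```

Only one of the 27 bases differs between the two fragments, yet the translated peptide carries valine where the original carries glutamic acid.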
2.1. Medicine
Since the completion of the Human Genome Project, numerous studies of human genetic diseases have been under way, the most common of these diseases being cancer. The International Cancer Genome Consortium (ICGC) was launched in 2008 to generate genetic data for about 50 of the most common cancer types and/or subtypes across the globe (https://icgc.org). In 2006, treatments targeting specific molecular abnormalities were made available for certain types of cancer, such as melanoma
[93] and lung cancer
[94], making some of these chronic illnesses manageable and possibly curable. This was made possible when Druker et al.
[95] developed imatinib (Gleevec), a drug that is highly effective in treating chronic myelogenous leukaemia by targeting the molecular abnormality unique to the disease. To date, more than 1000 human genetic tests are available, and some enable embryos created through in vitro fertilization to be screened for mutations that cause genetic disorders such as sickle cell disease and cystic fibrosis
[96][97].
Gene therapy, which targets faulty or missing genes to treat disease, is at the forefront of modern medicine
[98]. While gene therapy is currently being tested only for serious diseases such as haemophilia and AIDS, this innovation has shown promising progress during the past two decades, with a handful of notable successes including treatments for X-linked severe combined immunodeficiency and the inherited blindness Leber’s congenital amaurosis 2, which are caused by mutations in the interleukin-2 receptor γ chain (IL2RG) and the retinal pigment epithelium-specific 65 kDa protein (RPE65) genes, respectively
[99][100]. Gene therapy trials, however, carry a risk of severe side effects, which can lead to death
[101][102], and this still-evolving molecular medicine may require many more years of testing to be proven effective and safe for most conditions
[98][103].
Different types of gene-targeting vectors have been designed to elucidate gene function in vivo, from point mutations and insertions to gene deletions
[104]. Until individual genome sequencing becomes routine, DNA (or gene) chips can be considered one of the critical pharmacogenetic technologies. Featuring a tiny DNA microarray, gene chips reveal the level of activation of particular genes and assess a patient’s genetic suitability for certain drugs
[105]. Technological advancements in sequencing have facilitated the integration of pharmacogenetics (or pharmacogenomics) into clinical diagnostics, allowing doctors to prescribe medication based mainly on each patient’s genetics rather than on factors like age and body mass
[106]. With information on how genes influence the response of different individuals to the same drug, treatments can be selected more accurately, and the significant side effects of a specific drug in certain individuals can be avoided
[107][108]. Precision approaches are promising for protecting population health and addressing global health challenges (such as the spread of SARS-CoV-2 infections), but they should complement rather than replace efforts to strengthen public health infrastructure
[108]. Recently, several genetic markers associated with SARS-CoV-2 infection and COVID-19 severity have been identified, including FOXP4 and TYK2, which are linked to lung cancer and autoimmune diseases, respectively
[33].
2.2. Agriculture
Ancient genetic modification, which mainly involved selective breeding and artificial selection, began more than 32,000 years ago, and the first artificially selected organism is thought to be the domesticated dog (Canis lupus familiaris), a descendant of the grey wolf (Canis lupus)
[109]. Although the early uses of genetic engineering ranged from drug discovery to the production of biorenewables, its most controversial application was, and perhaps still is, food production. The first genetically engineered crop to reach the market was the tomato (Solanum lycopersicum). The product, Flavr Savr, received marketing approval from the United States Food and Drug Administration (FDA) in 1994 after years of extensive field experiments and health testing. Flavr Savr was created by introducing a reverse-orientation (antisense) copy of the polygalacturonase gene, which suppresses the formation of the polygalacturonase enzyme that dissolves cell-wall pectin in conventional tomatoes, allowing the fruit to stay firm longer after harvest
[110].
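The reverse-orientation strategy described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the gene fragment is a made-up sequence, not the actual polygalacturonase gene, and the function names are hypothetical. It shows only that a transcript made from a reverse-orientation copy is complementary to the normal transcript, which is what allows the two RNAs to pair and the normal message to be suppressed.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand (the reverse-orientation copy)."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """mRNA carries the coding-strand sequence with U in place of T."""
    return coding_strand.replace("T", "U")


def can_pair(rna_a: str, rna_b: str) -> bool:
    """True if rna_b is the exact antiparallel complement of rna_a."""
    return rna_a.translate(RNA_COMPLEMENT)[::-1] == rna_b


# Hypothetical fragment standing in for part of the target gene.
GENE_FRAGMENT = "ATGGTTATCCAAAGG"

sense_rna = transcribe(GENE_FRAGMENT)
antisense_rna = transcribe(reverse_complement(GENE_FRAGMENT))

if __name__ == "__main__":
    print(sense_rna)
    print(antisense_rna)
    print(can_pair(sense_rna, antisense_rna))  # True
```

In this toy model the two transcripts are perfectly complementary; in a plant the degree of suppression would depend on many factors that the sketch does not attempt to capture.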
Apart from extending the shelf life of food, genetic engineering has been used to produce pest-resistant (or pest-tolerant) plants that are easier to manage and cultivate. Two remarkable examples are Bt maize (Zea mays) and Bt cotton (Gossypium hirsutum), which have become the predominant varieties grown in the United States since their introduction in the mid-1990s. The delta-endotoxin (δ-endotoxin) genes from the soil bacterium Bacillus thuringiensis encode Cry proteins, which are specifically toxic to certain insect orders such as Lepidoptera, Diptera, and Coleoptera
[111]. Nonetheless, Bt plants have been reported to be highly vulnerable to certain insect pests that proliferate in some countries such as India, making farming in these countries more capital-intensive
[112]. Genetically engineered herbicide-resistant crops have also been created so that unwanted plants in fields can be controlled efficiently, the most prominent examples being the glyphosate (N-(phosphonomethyl)glycine)-resistant crops
[113].
Biofortification through rDNA or metabolic technology has also been attempted to increase the nutritional value of some staple crops
[114]. Golden rice, for example, was developed to combat vitamin A deficiencies in developing countries
[115]. In nature, the machinery to synthesize beta (β)-carotene (provitamin A) is fully active in rice leaves but partially turned off in its grains. The pathway is turned back on in golden rice with the addition of two genes encoding phytoene synthase (psy) and carotene desaturase (crtI) via genetic engineering, allowing β-carotene to accumulate in the grains
[116]. In July 2021, the Philippines became the first country to approve the commercial production of golden rice. It was recently reported that multiple biofortification traits (such as high provitamin A, high iron, and high zinc) can be introduced through metabolic engineering via transgenic technology. However, there have been no reported examples of sufficient nutrient enhancement through genome-editing approaches to date, and the combination of genetic engineering and conventional breeding is considered the most powerful approach for developing multi-nutrient crops
[114].
The genome of the first transgenic animal was successfully altered in 1986 by inserting a portion of the SV40 virus and the herpes simplex virus (HSV) gene that encodes thymidine kinase (TK) into an early-stage mouse (M. musculus) embryo so that the animal would develop cancer
[117]. Since then, OncoMice have been a typical model organism in clinical studies. The latest tools to create transgenic animals for human disease studies, including CRISPR/Cas9 systems and transcription activator-like effector nuclease (TALEN), are summarised in Volobueva et al.
[118]. Many species of animals have been genetically engineered to help address pollution and food shortages, from transgenic pigs (Sus scrofa domesticus) capable of producing a more environmentally friendly form of manure to transgenic salmon (Salmo salar) that grow to a marketable size in 1.5 rather than 3 years
[119]. Presently, no genetically engineered animals have been approved to enter the human food chain, although ATryn, an anticoagulant used to treat a rare blood clotting disorder and the first biopharmaceutical product produced by genetically engineered goats (Capra aegagrus hircus), was approved in 2009
[120][121].
The world may be on the brink of agreeing on the production of the first genetically engineered animal for human consumption, but the controversy surrounding both animal transgenic and cloning technologies will live on. This is the case for the famous Dolly the Sheep (Ovis aries), the first mammal successfully cloned by somatic cell nuclear transfer from an established cell line
[122]. Dolly was born in 1996, and her death six years later was as controversial as her life, as she lived for only about half the life expectancy of a sheep
[123]. The announcement of the birth of the first gene-edited, cloned macaque monkeys (Macaca fascicularis), in which CRISPR was used to knock out Brain and Muscle ARNT-Like 1 (BMAL1), has sparked outrage among animal welfare advocates and researchers around the globe
[124][125]. Nonetheless, genetic engineering technologies can be beneficial if they are applied properly. A recent study reported several genetically engineered mouse models that may be useful for SARS-CoV-2 research to combat COVID-19
[126].
2.3. Biodiversity
The successful development of genetically modified bacteria (E. coli) in 1973, and more specifically their subsequent use to produce genetically engineered insulin, marked the first breakthrough of the technology in the field of medicine
[127], leading to the production of the diabetes drug Humulin. In the 1980s, several other bacteria were genetically engineered. One notable example is the genetically modified Pseudomonas putida, which can help in oil spill mitigation through its ability to break down multiple components of crude oil
[128]. During the last decade, genetically modified bacteria have been used to produce various bio-based products, including recombinantly produced chymosin (or rennin) in cheese production and the first-generation bioplastics and biofuels
[129][130]. The genetic manipulation of the model cyanobacterium Synechocystis sp. PCC6803, for example, has led to an increase in production of the bioplastic polyhydroxybutyrate through overexpression of Rre37 and SigE, two major proteins involved in polyhydroxybutyrate synthesis
[131]. More recently, an enzyme was successfully engineered to break down polyethylene terephthalate, the most abundant polyester plastic, so that plastic bottles can be recycled
[132].
Climate change, one of the defining issues of the 21st century, is predicted to be the cause of extinction for up to 40% of existing species in the next 30 years
[133][134]. Biodiversity, evolution, and conservation are becoming increasingly important topics as a result of climate change and habitat loss, which can lead to extinction
[135][136]. Facilitated adaptation, in which gene variants from a well-adapted population are transferred into the genomes of threatened populations of either the same or a different species, has been proposed as a way to mitigate maladaptation and avert extinction. Nevertheless, this intervention may benefit only certain species and carries its own set of challenges and complications
[137]. While some reports state that climate-related local extinctions have occurred in hundreds of species, a comparable number of species have survived at the warm edges of their ranges. This implies that genetic adaptations or phenotypic plasticity may enable some populations to tolerate warmer conditions
[138]. Intraspecific adaptations should, therefore, be taken into account when assessing species’ vulnerability to climate change
[139], though the prevailing issue is the absence of robust methodologies that fully allow the incorporation of genomic information in projecting species responses to a changing climate and in strategizing conservation plans
[136][140]. Some species may survive climate change through dispersal, niche shifts, or both
[140][141].
International conservation policy recognises three levels of biodiversity: genetic, species, and ecosystem, all of which should be retained by conservation management
[142]. Over the past three decades, many of the genetic, ecological, and geographical factors that contribute to speciation have been well established, mainly due to the maturation of both theoretical and empirical speciation research
[143]. One recent example is the study on the dynamics of explosive diversification and accumulation of species diversity based on the assembly of 100 cichlid genomes
[144]. The rapid succession of speciation events within explosive adaptive radiation was reported to depend primarily on the exceptional genomic potential of the cichlids, which is driven by the high density of ancient indel polymorphisms that are mostly linked to ecological divergence
[144]. Nonetheless, it is worth noting that the loss of genetic diversity in both terrestrial and marine ecosystems has accelerated during the last few decades, spurred largely by anthropogenic activities such as agriculture and industry
[145]. Maintaining resilience, community function, evolutionary potential, and adaptive capacity in these ecosystems through the maintenance of genetic diversity is among the central components of the Sustainable Development Goals (SDGs), including SDG 14 Life Below Water and SDG 15 Life On Land. According to Jung et al.
[136], spatial guidance is required to determine which land areas can potentially generate the greatest synergies between biodiversity conservation and nature’s contributions to humanity in order to support goal setting, strategies, and action plans for the biodiversity and climate conventions.