Genetics Matters: History

The understanding of how genetic information is inherited across generations was established by Gregor Mendel in the 1860s, when he developed the fundamental principles of inheritance. The science of genetics, however, began to flourish only in the mid-1940s, when DNA was identified as the carrier of genetic information. The world has since witnessed the rapid development of genetic technologies, the latest being genome-editing tools, which have revolutionized fields from medicine to agriculture. This review walks through the historical timeline of genetics research and considers how the discipline might help furnish a sustainable future for humanity.

  • agriculture
  • biodiversity
  • heredity
  • gene-editing
  • genetic technologies
  • medicine
  • sustainability

1. A Trip down Memory Lane: One and a Half Centuries into the Intriguing Study of Heredity

Gregor Mendel, recognized as the Father of Modern Genetics, was an Austrian monk who established the foundational principles of heredity through his breeding experiments on the common pea (Pisum sativum) and coined the terms dominant and recessive [1]. Mendel deemed peas a suitable model system mainly because of their distinct, constant differentiating characteristics and because their hybrids yield perfectly fertile progeny [2,3]. After eight years of tedious pursuit, he finally published his work, entitled “Experiments on Plant Hybridization”, in 1866, proposing the principles of uniformity, segregation, and independent assortment [1]. Unlike the work of his contemporary Charles Darwin, who developed the Theory of Evolution, Mendel’s work was not widely known until the 1900s, and its relevance fell in and out of favour as genetic theory continued to develop [2,4]. In 1882, chromosomes were first described by Walther Flemming, the founder of cytogenetics, who pioneered the study of mitosis [5]. The chromosomal theory of heredity, however, was only established in 1910, when Thomas Morgan discovered sex-linked inheritance through his breeding analysis of millions of red-eyed wild-type and white-eyed mutant fruit flies (Drosophila melanogaster) [6]. Morgan’s findings confirmed Mendel’s principles of heredity and demonstrated that genes are located on chromosomes [7].
A gene, the basic unit of heredity, is made up of deoxyribonucleic acid (DNA), which was first characterized by Friedrich Miescher about 150 years ago, in 1871 [8]. While examining proteins in leucocytes, Miescher obtained a novel substance from the nuclei that differed fundamentally from proteins, which he termed nuclein [9]. Like Mendel’s, Miescher’s discovery was well ahead of its time, and nuclein unfortunately remained mostly unknown until interest in the DNA molecule was revived in the mid-1940s, when Oswald Avery and his colleagues published the first evidence that DNA, rather than protein, is the carrier of genetic information in their transformation experiments using pneumococcus bacteria [10]. This work also paved the way for the revelation that the three-dimensional structure of DNA is a double helix, deciphered by James Watson and Francis Crick in 1953. A nucleotide, the basic building block of DNA, is composed of a five-carbon sugar (deoxyribose), a phosphate group, and one of four nitrogenous bases, namely adenine (A), cytosine (C), guanine (G), and thymine (T); together, the sequence of bases provides the underlying genetic basis (the genotype) that informs a cell what to do and what kind of specialized cell to become (the phenotype). The Watson–Crick model has been both highly acclaimed and controversial, the latter stemming largely from the fact that their work depended directly on the research of several scientists before them, including Maurice Wilkins and Rosalind Franklin [11,12].
In the early 1960s, the genetic code, which consists of 64 triplets of nucleotides (codons), was decoded by Nirenberg et al. [13], followed by the establishment of the central dogma of molecular biology, which describes the flow of genetic information from gene sequence to protein product through three fundamental processes: replication, transcription, and translation. Nevertheless, the first violation of the central dogma was reported less than a decade later, in 1970, when David Baltimore and Howard Temin discovered reverse transcriptase in retroviruses, demonstrating that genetic information can also flow in reverse, from ribonucleic acid (RNA) to DNA. Their discovery revolutionized molecular biology and formed the cornerstone of cancer biology and retrovirology [14].
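The triplet reading described above can be sketched in a few lines of Python. This is a minimal illustration only: it includes just a handful of the 64 codons (taken from the standard genetic code table), not a complete translation machinery.

```python
# Minimal sketch of translation: reading a DNA coding sequence
# three bases (one codon) at a time until a stop codon is reached.
# Only a few of the 64 codons are listed here, for illustration.
CODON_TABLE = {
    "ATG": "Met",  # also the start codon
    "GGC": "Gly",
    "GAG": "Glu",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}

def translate(dna: str) -> list[str]:
    """Translate a DNA coding sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

print(translate("ATGGGCGAGTAA"))  # ['Met', 'Gly', 'Glu']
```

Reading frame matters: shifting the start position by one base would regroup every codon, which is why insertions and deletions that are not multiples of three are so disruptive.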
Initial efforts to sequence a gene were cumbersome and time consuming; for example, it took months to sequence the mere 24-base-pair lactose operator of Escherichia coli using methods that involved extensive use of hazardous chemicals [15]. Invented by Allan Maxam and Walter Gilbert in the mid-1970s, the Maxam–Gilbert sequencing method involves chemical modification of DNA and subsequent cleavage at specific bases, which necessitates radioactive labelling at one end and purification of the DNA fragment of interest [15,16]. The dawn of rapid sequencing began in 1977, when the chain termination method, better known as Sanger sequencing, was developed based on the process of DNA replication [17]. Pioneered by Frederick Sanger, the technique was used to sequence the first DNA genome, that of the bacteriophage ϕX174, which has often served as a positive control genome in sequencing labs around the world since its completion [17,18]. Another notable invention from that period is the polymerase chain reaction (PCR), a technique for generating millions of copies of a specific section of DNA [19]. Invented by Kary Mullis in 1983, the technique has since been employed for a variety of applications, including decoding the human genome, preserving animals and coral reefs, and, most recently, detecting COVID-19 [20].
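The "millions of copies" that PCR generates follow from simple doubling arithmetic: under idealized conditions, each thermal cycle doubles the number of copies of the target. The sketch below assumes perfect amplification efficiency; real reactions fall short of exact doubling and eventually plateau as reagents are depleted, so this is an upper bound.

```python
# Idealized PCR amplification: each thermal cycle doubles the target copies.
# Real reactions have <100% efficiency, so this is an upper bound.
def pcr_copies(initial_copies: int, cycles: int) -> int:
    """Number of amplicon copies after the given number of perfect doubling cycles."""
    return initial_copies * 2 ** cycles

# A single template molecule exceeds one million copies after just 20 cycles,
# which is why PCR can detect minute amounts of DNA (e.g., in COVID-19 testing).
print(pcr_copies(1, 20))  # 1048576
```

The same exponential logic underlies quantitative PCR, where the cycle number at which the signal crosses a threshold indicates how much template was initially present.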
The early 21st century saw the rise of next-generation DNA sequencing (NGS), as booming sequencing companies launched their own technologies, the initial “big three” platforms being Roche/454, Life Technologies/SOLiD, and Illumina/Solexa. In contrast to the Sanger method, which allows only a single DNA fragment to be sequenced at a time, NGS can sequence millions of fragments in a single run [21]. It is worth noting that sequencing costs have dropped dramatically since the invention of automated sequencing protocols. Take the human genome, for example: costs have decreased from approximately USD 3 billion for the first 3.2-gigabase genome in the early 2000s to as low as about USD 1000 for a genome today. The first next-generation sequencer, the 454 system, was introduced in 2005 by Jonathan Rothberg and his colleagues, who demonstrated how the genome of the parasitic bacterium Mycoplasma genitalium could be sequenced in a single run using the emulsion PCR technique. The team later sequenced the genome of James Watson, pioneering what has since become widespread personal genomics [22]. Since then, several sequencers have been developed, including the Solexa 1G (or Genome Analyzer 1), HiSeq 2000, and ultra-high-throughput systems like the HiSeq 4000 and NovaSeq 6000 [22,23].
During the 2010s, more far-reaching sequencing technologies were developed, from semiconductor chips to nanoballs, each shaping which studies are feasible and the sequencing market at large [18,24]. The advent of gene-editing technologies, such as clustered regularly interspaced short palindromic repeats (CRISPR), has transformed many fields in the 21st century, particularly medicine. One recent example is the rapid and accurate CRISPR-Cas12-based detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the betacoronavirus that has caused over one million deaths worldwide since the outbreak began in December 2019 [19,25]. Gene editing itself is not a new phenomenon; techniques for editing and knocking out genes have been available since the 1980s, when gene-editing technology was first developed and introduced [26]. Genome editing has expanded the potential of therapeutic technologies, a recent example being induced pluripotent stem cells (iPSCs), which provide new models and treatments for a variety of disorders, including Alzheimer’s and Parkinson’s diseases [27].

2. How Has the Cracking of Genetic Code Improved Life on Earth?

Life has existed on Earth for approximately four billion years, yet the genetic code was decoded only in the 1960s, after the vast majority of human history had already passed [28]. Within the last half-century, the world has witnessed tremendous discoveries in all critical areas of the life sciences, from medicine to agriculture. The human genome was completed in the early 2000s, around the same time as those of a handful of model organisms, including arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and the fruit fly (Drosophila melanogaster) (Figure 1). After the birth of next-generation sequencing (NGS) technologies in the mid-2000s, hundreds of other organisms had their genomes completely sequenced, and millions of genes have been annotated, be they genes responsible for severe diseases in humans or genes conferring resistance and tolerance in crops [29]. This is especially true for microorganisms with small genomes, such as viruses and single-celled organisms (bacteria and protozoa), for which the number of sequenced genomes has exploded [30]. Both partial and complete SARS-CoV-2 genome sequences were obtained in the first two months of the epidemic [31]. However, like all other viruses, SARS-CoV-2 undergoes mutations, or small changes in its genome, demonstrating that the virus is evolving [32]. To allow the human genetics community to share important findings on the genetic determinants of COVID-19 susceptibility and severity, the COVID-19 Host Genetics Initiative [33] was established in spring 2020, and genetic association results for several gene clusters (such as TYK2, DPP9, and OAS1/2/3) were publicly released in January 2021 [34].
Although the sequencing of multicellular organisms has lagged slightly behind, a handful of massive sequencing projects are actively ongoing, including the Earth BioGenome Project, which aims to sequence the genomes of all 1.5 million known eukaryotic species, and the Darwin Tree of Life Project, which seeks to sequence the genomes of the 66,000 species of animals, plants, and fungi in the United Kingdom over the course of a decade [35,36]. To date, more than 16,000 sequenced eukaryotic genomes are available in the public domain (https://www.ncbi.nlm.nih.gov). Table 1 presents some examples of well-annotated genomes of multicellular organisms from the primary eukaryotic kingdoms published since the 2000s.
Table 1. Examples of multicellular organisms with well-annotated genomes.
An organism’s genetic code is made up of merely four bases (A, C, G, and T), but a change in a single base, known as a single nucleotide polymorphism (SNP), among thousands of bases can potentially lead to changes in protein structure and function, impacting one or more traits of an organism. Abnormal changes in the DNA of a gene are termed gene mutations, which may have little to no noticeable effect or can considerably affect cells in numerous ways [86]. Some mutations cause a gene to be switched on, producing more of its protein than usual, while a small percentage of mutations have been found to cause genetic disorders [86,87]. For instance, a mutated version of the beta-globin gene that helps make haemoglobin causes sickle cell anaemia [88]. Recently, a robust sequence-resolved benchmark set for the detection of both false positive and false negative germline large insertions and deletions has been developed [89]. Modern genetic modification, interchangeably known as genetic engineering, is the process of altering the genetic makeup of an organism using recombinant DNA (rDNA) technology. A gene (sometimes two or more) from one species is isolated, spliced into a vector with the aid of restriction enzymes, and then introduced into the host species, creating a “transgenic” organism, called a genetically modified organism (GMO), with desirable characteristics [1]. The first genetically engineered animal (the mouse, Mus musculus) and the first genetically engineered plant (tobacco, Nicotiana tabacum) were produced in 1974 and 1983, respectively. The inception of genetic engineering initially sparked concern from various parties, including governments, scientists, and the media, over its potential adverse effects on human health and ecosystems worldwide [90]. With the establishment of a safe and practical guide to rDNA research in 1975, the technology has continued to advance rapidly, impacting medicine, agriculture, and biodiversity [91,92].
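The single-base changes described above can be made concrete with the sickle cell example: an A-to-T substitution in the beta-globin gene turns the codon GAG (glutamic acid) into GTG (valine). The short sketch below compares two codons position by position to locate the change; the codon and amino acid assignments are from the standard genetic code.

```python
# A single nucleotide change can alter the encoded amino acid.
# Sickle cell example: GAG (glutamic acid) -> GTG (valine) in beta-globin.
def point_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (position, base_a, base_b) for every mismatched position."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

normal, sickle = "GAG", "GTG"
# One base differs (position 1, A -> T), yet the encoded amino acid changes,
# which in turn changes haemoglobin's behaviour.
print(point_differences(normal, sickle))  # [(1, 'A', 'T')]
```

Because the genetic code is degenerate (most amino acids have several codons), not every single-base change alters the protein; the comparison above only finds where sequences differ, while the consequence depends on which codon results.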

2.1. Medicine

Since the completion of the Human Genome Project, a multitude of research efforts have been underway into human genetic diseases, the most common of which is cancer. The International Cancer Genome Consortium (ICGC) was launched in 2008 to generate genetic data for about 50 of the most common cancer types and/or subtypes across the globe (https://icgc.org). By 2006, treatments targeting specific molecular abnormalities had become available for certain types of cancer, such as melanoma [93] and lung cancer [94], making some of these chronic illnesses manageable and possibly curable. This was made possible when Druker et al. [95] developed imatinib (or Gleevec), a drug highly effective in treating chronic myelogenous leukaemia by targeting its unique molecular abnormality. To date, more than 1000 human genetic tests are available, and some enable embryos created through in vitro fertilization to be screened for the mutations that cause genetic disorders such as sickle cell disease and cystic fibrosis [96,97].
Gene therapy, which targets faulty or missing genes to treat disease, is at the forefront of modern medicine [98]. While gene therapy is currently being tested mainly for severe diseases like haemophilia and AIDS, the approach has shown promising progress during the past two decades, with notable successes including treatments for X-linked severe combined immunodeficiency and the inherited blindness Leber’s congenital amaurosis 2, which are caused by mutations in the interleukin-2 receptor γ chain (IL2RG) and retinal pigment epithelium-specific 65 kDa protein (RPE65) genes, respectively [99,100]. Gene therapy trials, however, carry the risk of severe, sometimes fatal, side effects [101,102], and this still-evolving branch of molecular medicine may require many more years of testing before it is proven effective and safe for most conditions [98,103].
Different types of gene-targeting vectors have been designed to elucidate gene function in vivo, covering everything from point mutations and insertions to gene deletions [104]. Until individual genome sequencing becomes routine, DNA (or gene) chips can be considered one of the critical pharmacogenetics technologies. Featuring a tiny DNA microarray, gene chips reveal the level of activation of particular genes and assess a patient’s genetic suitability for certain drugs [105]. Technological advancements in sequencing have facilitated the integration of pharmacogenetics (or pharmacogenomics) into clinical diagnostics, allowing doctors to prescribe medication based mainly on an individual patient’s genetics rather than on factors like age and body mass [106]. With information on how genes influence different individuals’ responses to the same drug or medication, treatments can be selected more accurately, and significant side effects of a specific drug in certain individuals can be avoided [107,108]. Precision approaches are promising for protecting population health and addressing challenges in the global health landscape (such as the spread of SARS-CoV-2 infections), but they should complement rather than replace efforts to strengthen public health infrastructure [108]. Recently, several genetic markers associated with SARS-CoV-2 infection and COVID-19 disease severity have been identified, including FOXP4 and TYK2, which are linked to lung cancer and autoimmune diseases, respectively [33].

2.2. Agriculture

Ancient genetic modification, involving mainly selective breeding and artificial selection, began more than 32,000 years ago, and the first artificially selected organism is thought to be the domesticated dog (Canis lupus familiaris), descendant of the grey wolf (Canis lupus) [109]. Although early uses of genetic engineering ranged from drug discovery to the production of biorenewables, the most controversial application of the technology was, and perhaps still is, food production. The first genetically altered crop to reach the market was the tomato (Solanum lycopersicum): the Flavr Savr received marketing approval from the United States Department of Agriculture (USDA) in 1994 after years of extensive field experiments and health testing. The Flavr Savr was created by introducing a reverse-orientation copy of the polygalacturonase gene, which suppresses the formation of the polygalacturonase enzyme that dissolves cell-wall pectin in conventional tomatoes, allowing the fruit to stay firm longer after harvest [110].
Apart from extending the shelf life of food, genetic engineering has been used to produce pest-resistant (or pest-tolerant) plants that are easier to manage and cultivate. Two remarkable examples are Bt maize (Zea mays) and Bt cotton (Gossypium hirsutum), which have become the predominant varieties grown in the United States since their introduction in the mid-1990s. Genes for delta-endotoxins (δ-endotoxins) from the soil bacterium Bacillus thuringiensis encode Cry proteins, which are specifically toxic to certain insect orders such as Lepidoptera, Diptera, and Coleoptera [111]. Nonetheless, Bt plants have been reported to be highly vulnerable to certain insect pests that proliferate in some countries, such as India, making farming in these countries more capital-intensive [112]. Genetically engineered herbicide-resistant crops have also been created to control unwanted plants in fields efficiently, the most eminent examples being crops resistant to glyphosate (N-(phosphonomethyl)glycine) [113].
Biofortification through rDNA or metabolic technology has also been attempted to increase the nutritional value of some staple crops [114]. Golden rice, for example, was developed to combat vitamin A deficiency in developing countries [115]. In nature, the machinery to synthesize beta (β)-carotene (provitamin A) is fully active in rice leaves but partially switched off in the grains. In golden rice, the pathway is switched back on by adding two genes, encoding phytoene synthase (psy) and carotene desaturase (crtI), via genetic engineering, allowing β-carotene to accumulate in the grains [116]. In July 2021, the Philippines became the first country to approve the commercial production of golden rice. It was recently reported that multiple biofortification traits (such as high provitamin A, high iron, and high zinc) can be introduced through metabolic engineering via transgenic technology. However, there have been no reported examples to date of sufficient nutrient enhancement through genome-editing approaches, and the combination of genetic engineering and conventional breeding is considered the most powerful approach for developing multi-nutrient crops [114].
The first transgenic animal model of cancer was created back in 1986 by inserting a portion of the SV40 virus and a herpes simplex virus (HSV) gene encoding thymidine kinase (TK) into an early-stage mouse (M. musculus) embryo, predisposing the animal to cancer [117]. Since then, OncoMice have been a typical model organism in clinical studies. The latest tools for creating transgenic animals for human disease studies, including CRISPR/Cas9 systems and transcription activator-like effector nucleases (TALENs), are summarised in Volobueva et al. [118]. Many animal species have been genetically engineered to help combat pollution and starvation, from transgenic pigs (Sus scrofa domesticus) capable of producing an environmentally friendly form of manure to transgenic salmon (Salmo salar) that grow to a marketable size within 1.5 years instead of 3 [119]. Presently, no genetically engineered animal has been approved to enter the human food chain, although ATryn, the first biopharmaceutical product produced by genetically engineered goats (Capra aegagrus hircus) and an anticoagulant used to treat a rare blood clotting disorder, was approved in 2009 [120,121].
The world may be on the brink of approving the production of the first genetically engineered animal for human consumption, but the debate surrounding animal transgenic and cloning technologies will live on. Such was the case for the famed Dolly the Sheep (Ovis aries), the first mammal successfully cloned by somatic cell nuclear transfer from an established cell line [122]. Dolly was revealed to the world in 1996, and her death six years later was as controversial as her life, as she lived only about half the life expectancy of a sheep [123]. The announcement of the birth of the first gene-edited, cloned macaque monkeys (Macaca fascicularis), in which CRISPR was used to knock out Brain and Muscle ARNT-Like 1 (BMAL1), sparked outrage from animal welfare advocates and researchers around the globe [124,125]. Nonetheless, genetic engineering technologies can be beneficial if applied responsibly. A recent study reported several genetically engineered mouse models that may be useful in SARS-CoV-2 research to combat COVID-19 [126].

2.3. Biodiversity

The successful development of genetically modified bacteria (E. coli) in 1973, and subsequently of genetically engineered insulin, marked the first breakthrough of the technology in the field of medicine [127], leading to the production of the diabetes drug Humulin. In the 1980s, several other bacteria were genetically engineered. One notable example is the genetically modified Pseudomonas putida, which can help mitigate oil spills through its ability to break down multiple components of crude oil [128]. During the last decade, genetically modified bacteria have been used to produce various bio-based products, including recombinant chymosin (or rennin) for cheese production and the first-generation bioplastics and biofuels [129,130]. Genetic manipulation of the model cyanobacterium Synechocystis sp. PCC6803, for example, has increased production of the bioplastic polyhydroxybutyrate through overexpression of Rre37 and SigE, two major proteins involved in polyhydroxybutyrate synthesis [131]. More recently, an enzyme was successfully engineered to break down polyethylene terephthalate, the most abundant polyester plastic, enabling bottles to be recycled [132].
Climate change, one of the defining issues of the 21st century, is predicted to cause the extinction of up to 40% of existing species within the next 30 years [133,134]. Biodiversity, together with evolution and conservation, is becoming increasingly important as climate change and habitat loss drive species towards extinction [135,136]. Facilitated adaptation, in which gene variants from a well-adapted population are transferred into the genomes of threatened populations of the same or a different species, has been put forward to mitigate maladaptation and avert extinction. Nevertheless, this intervention may benefit only certain species and carries its own set of challenges and complications [137]. While some reports state that climate-related local extinctions have occurred in hundreds of species, an equivalent number of species have managed to survive at their warm range edges, implying that genetic adaptation or phenotypic plasticity may enable some populations to tolerate warmer conditions [138]. Intraspecific adaptations should, therefore, be taken into account when assessing species’ vulnerability to climate change [139], though a prevailing issue is the absence of robust methodologies for fully incorporating genomic information into projections of species’ responses to a changing climate and into conservation planning [136,140]. Some species may survive climate change through dispersal, niche shifts, or both [140,141].
International conservation policy recognises three levels of biodiversity: genetic, species, and ecosystem, all of which should be retained through conservation management [142]. Over the past three decades, many of the genetic, ecological, and geographical factors that contribute to speciation have been well established, largely owing to the maturation of both theoretical and empirical speciation research [143]. One recent example is a study of the dynamics of explosive diversification and the accumulation of species diversity based on the assembly of 100 cichlid genomes [144]. The rapid succession of speciation events within this explosive adaptive radiation was reported to depend primarily on the exceptional genomic potential of the cichlids, driven by a high density of ancient indel polymorphisms mostly linked to ecological divergence [144]. Nonetheless, it is worth noting that the loss of genetic diversity in both terrestrial and marine ecosystems has accelerated during the last few decades, spurred largely by anthropogenic activities such as agriculture and industry [145]. Maintaining resilience, community function, evolutionary potential, and adaptive capacity in these ecosystems through the maintenance of genetic diversity is among the central components of the Sustainable Development Goals (SDGs), including SDG 14 (Life Below Water) and SDG 15 (Life on Land). According to Jung et al. [136], spatial guidance is required to determine which land areas can generate the greatest synergies between biodiversity conservation and nature’s contributions to humanity, in order to support goal setting, strategies, and action plans for the biodiversity and climate conventions.

This entry is adapted from the peer-reviewed paper 10.3390/ijms23073976
