Genome editing is a technique for making precise, targeted modifications within the genome [14] through the deletion, insertion, or substitution of single bases or specific sequences [15][16]. The precursor of plant genome editing dates back to the 1970s with the development of genetic engineering, in which genome manipulation was carried out through the random introduction of specific gene sequences via homologous recombination (HR), leading to the inactivation or ‘knock-out’ of the targeted gene function. Further, the discovery of meganucleases during the 1980s improved the process of targeted genome engineering. These discoveries led to the evolution of genome editing technologies, which have grown at a rapid pace over the past 10 years and have become established as an extraordinary genome engineering tool [17][18]. Genome editing can be performed both in vitro and in vivo [19] via in situ delivery of the editing machinery. Highly targeted genome alterations arise when sequence-specific nucleases introduce double-stranded DNA breaks (DSBs), which are then repaired either through non-homologous end-joining (NHEJ) or homologous recombination (HR)/homology-directed repair (HDR), depending on the cell type [20][21].
In the NHEJ mechanism, the broken ends are rejoined with the deletion or insertion of nucleotide sequences of varying lengths, which leads to the disruption of gene function [22], whereas in HDR, a donor template carrying a stretch of homologous nucleotide sequence guides the repair, leading to more accurate repair with specific alterations of the genomic sequence [23]. Because HDR depends on a donor template, it is slower and less frequent than NHEJ [24]; thus, achieving HDR-mediated repair in plants is very difficult [25][26]. To create a gene knockout mutant via insertion/deletion or gene replacement, various sequence-specific nucleases, viz., zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR-associated proteins (Cas9, Cas12), can be employed. These nucleases were discovered through groundbreaking work in bacteria, yeast, and mammalian systems, but are also applicable to a wide variety of crop plants for trait improvement [27]. These nucleases are detailed in the subsections below.
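The contrast between the two repair outcomes can be illustrated with a toy sequence. The following is a minimal sketch, not a model of real repair biology: the coding sequence, the cut position, and the helper names (`nhej_deletion`, `hdr_replace`, `is_frameshift`) are all hypothetical and chosen only for illustration. It shows why an NHEJ indel whose length is not a multiple of three tends to disrupt gene function, while a donor-guided replacement preserves the reading frame.

```python
def nhej_deletion(seq: str, cut: int, deleted: int) -> str:
    """Toy NHEJ outcome: drop `deleted` bases starting at the cut position."""
    return seq[:cut] + seq[cut + deleted:]


def hdr_replace(seq: str, start: int, donor_segment: str) -> str:
    """Toy HDR outcome: overwrite bases at `start` with the donor-specified segment."""
    return seq[:start] + donor_segment + seq[start + len(donor_segment):]


def is_frameshift(original: str, edited: str) -> bool:
    """A length change that is not a multiple of 3 shifts the reading frame."""
    return (len(original) - len(edited)) % 3 != 0


# Hypothetical 15-nt coding sequence: ATG GCT GAA ACC TGA
cds = "ATGGCTGAAACCTGA"

nhej_product = nhej_deletion(cds, cut=6, deleted=2)            # 2-bp deletion
hdr_product = hdr_replace(cds, start=6, donor_segment="GAT")   # GAA -> GAT codon swap

print(nhej_product, "frameshift:", is_frameshift(cds, nhej_product))
print(hdr_product, "frameshift:", is_frameshift(cds, hdr_product))
```

In this sketch the 2-bp deletion shifts the frame, whereas the donor-specified codon substitution leaves the sequence length unchanged.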
2.1. Zinc Finger Nucleases (ZFNs)
ZFNs are engineered proteins consisting of a zinc finger domain at the N-terminus and an endonuclease domain at the C-terminus [28][29]. The zinc finger domain is necessary for the specific recognition of the targeted DNA sequence, and the endonuclease domain of the FokI restriction enzyme (RE), isolated from Flavobacterium okeanokoites, cleaves the DNA at that site [30]. Dimerization of the FokI domain is indispensable for its function; hence, two ZFNs must bind opposite strands of the DNA so that their FokI domains align and dimerize. Each ZFN contains a tandem array of three to six Cys2His2 zinc fingers, each recognizing approximately 3 bp of DNA [31]. The sequence-specific binding of the zinc-finger domain directs the nuclease to cleave a specific genomic site. This mechanism was exploited to design ZFN-mediated gene-editing tools that are extensively used for customized engineering of the genome in many organisms [32].
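The arithmetic behind ZFN target recognition can be sketched as follows. This is a minimal illustration that assumes exactly 3 bp per finger and ignores the spacer between the two half-sites; the function name `zfn_site_length` is hypothetical. Under these assumptions, a pair of ZFNs with three to six fingers each recognizes 18 to 36 bp, which agrees with the target sequence length listed for ZFNs in Table 1.

```python
BP_PER_FINGER = 3  # each Cys2His2 finger reads roughly 3 bp (value taken from the text)


def zfn_site_length(fingers_per_monomer: int, monomers: int = 2) -> int:
    """Total bp recognized by a ZFN pair, ignoring the spacer between half-sites."""
    if not 3 <= fingers_per_monomer <= 6:
        raise ValueError("the text describes arrays of three to six fingers")
    return fingers_per_monomer * BP_PER_FINGER * monomers


for n in range(3, 7):
    print(n, "fingers per monomer ->", zfn_site_length(n), "bp recognized by the pair")
```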
The breakthrough of ZFNs as programmable nucleases began in mice, where they were used to create gene knockouts via DSBs at target sequences, and the approach rapidly spread to other laboratories [29]. It was later extended to agriculture for crop improvement, but genome editing has been attempted in only a limited number of plants, such as Arabidopsis, tobacco, and maize [33]. Moreover, off-target binding of the ZF motifs to sequences other than the target reduces their efficiency as an editing tool [34]. In addition, designing a ZFN via protein engineering is very challenging and highly time-consuming; thus, it is not cost-effective for creating a particular mutation.
Table 1. Comparison of major genome editing technologies available for plant improvement.
| Attributes | ZFNs | TALENs | CRISPR/Cas9 |
| --- | --- | --- | --- |
| Cleavage type | Protein-dependent | Protein-dependent | RNA-dependent |
| Size | Significantly smaller than Cas9 (+) | Comparatively larger than ZFNs (++) | Significantly larger than both ZFNs and TALENs (+++) |
| Components | Zinc finger domains, non-specific FokI nuclease domain | TALE DNA-binding domains, non-specific FokI nuclease domain | Cas9 protein, crRNAs |
| Catalytic domain(s) | FokI endonuclease domain | FokI endonuclease domain | HNH, RuvC |
| Structural components (Dimeric/Monomeric) | Dimeric | Dimeric | Monomeric |
| Target sequence length (bp) | 18-36 | 24-59 | 20-22 |
| gRNA production required | No | No | Yes |
| Cloning required | Yes | Yes | No |
| Protein engineering steps needed | Yes | Yes | No |
| Mode of action | Induce DSBs in target DNA | Induce DSBs in target DNA | Induce DSBs or single-strand DNA nicks in target DNA |
| Restriction target site | High G | 5’T and 3’A | PAM sequence |
| Level of target recognition efficiency | High | High | Very high |
| Targeting | Poor | Good | Very good |
| Mutation rate level | High | Low | Very low |
| Off-target effects | Yes | Yes | Yes, but can be minimized by selection of unique crRNA sequence |
| Cleavage of methylated DNA possible | No | No | Yes, but it will be explored more |
| Multiplexing enabled | Highly difficult | Highly difficult | Yes |
| Labour intensiveness in experiment setup | Yes | Yes | No |
| Possible to generate large scale libraries | No | Yes, but it is highly challenging | Yes |
| Design feasibility | Difficult | Difficult | Easy |
| Technology cost | Very high (£1000-£3000) | High (£40-£350) | Comparatively low (£30-£300) |
2.2. TALENs
For many years, ZFNs were the only programmable site-specific nucleases explored, but they have been outpaced by the discovery of a DNA-binding effector protein, the transcription activator-like effector (TALE), isolated from the plant-pathogenic bacterium Xanthomonas [35][36]. It primarily acts as a transcriptional regulator of the disease susceptibility (S) genes in rice. This protein is characterized by a C-terminal activation domain (AD) and nuclear localization signal (NLS) required for transcriptional regulation, a central tandem repeat region acting as the DNA-binding domain (DBD), and an N-terminal translocation signal sequence [37]. The DBD contains a series of repeats, each 33–35 amino acids long; within each repeat, the two hypervariable amino acids at the 12th and 13th positions, known as the repeat-variable di-residues (RVDs), are responsible for the specific recognition of nucleotide bases [38]. The sequence-specific binding property of the DBD was exploited to develop a new gene-editing technology, i.e., transcription activator-like effector nucleases (TALENs).
Similar to ZFNs, TALENs are customized by fusing the DBD of the transcription activator-like effector (TALE) with the FokI restriction enzyme [39] (Table 1). However, unlike ZFNs, TALENs are much easier to design, as the repeat sequences of the TALEs have specificity for targeting single sites in a genome. Further, multimerization of the repeat sequence is not essential for constructing a long DBD array, as it is in ZFNs; hence the engineering is quite easy and less time-consuming [40]. The identification of RVDs in the repeat regions of TALEs helps define their specificity for various binding targets: because each RVD targets a single nucleotide, TALENs can be designed for a greater number of potential target sites than ZFNs. Therefore, TALENs have been utilized for genome editing in a wide variety of plants. Additionally, fusing TALEs with gene activators and repressors, instead of a nuclease, yields efficient artificial transcriptional regulators for achieving the desired gene regulation. Despite the advantages of TALENs over ZFNs in terms of high target specificity and low off-target effects, the extensive repeat structure in the DBD of the TALE protein is the major limiting factor for their use in target-specific editing of multiple genomes, and protein engineering is always tedious. To resolve these issues, genome editing using programmable RNA-guided DNA endonucleases has become more popular.
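The one-RVD-per-nucleotide principle described above lends itself to a simple lookup. The sketch below assumes the commonly cited RVD code (NI for A, HD for C, NN for G, NG for T), which is not listed in this text and is stated here as an assumption; NN is also reported to bind A, so real designs may prefer other RVDs for G. The function names are hypothetical.

```python
# Commonly cited RVD-to-base code; an assumption here, not taken from the text.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}
BASE_FOR_RVD = {rvd: base for base, rvd in RVD_FOR_BASE.items()}


def design_rvd_array(target: str) -> list:
    """Return one RVD per nucleotide of the target DNA sequence."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


def read_rvd_array(rvds: list) -> str:
    """Translate an RVD array back into the DNA sequence it is expected to bind."""
    return "".join(BASE_FOR_RVD[rvd] for rvd in rvds)


array = design_rvd_array("GATTACA")  # hypothetical 7-nt target
print(array)
print(read_rvd_array(array))
```

Because each repeat is chosen independently, extending the target by one nucleotide only requires appending one more repeat, which is the design flexibility the paragraph attributes to TALENs.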
2.3. CRISPR/Cas System
CRISPR (clustered regularly interspaced short palindromic repeats) is a DNA locus found in prokaryotic genomes (bacteria and archaea) that consists of often palindromic, 29-nucleotide (nt)-long identical tandem repeats separated by unique spacers (32 nt in length) [41]. The locus is accompanied by several conserved CRISPR-associated (Cas) protein-coding genes that are involved in the adaptive immunity of prokaryotes against bacteriophages. This CRISPR-mediated immunity is functionally analogous to eukaryotic RNA interference (RNAi) [42], with the additional advantage of building a genetic memory of past encounters. The small CRISPR RNAs (crRNAs) transcribed from this genetic memory (the spacers acquired between the CRISPR repeats) serve as guide RNAs that direct the Cas proteins to cleave the virus genome [43][44]. The programmable nature of this mechanism led to the design of an RNA-guided DNA endonuclease-based genome editing tool, popularly known as CRISPR/Cas.
CRISPR/Cas-based genome editing relies on RNA:DNA base pairing to target the host DNA and is used as a novel system for precise genome manipulation in many organisms, including plants [45]. This technique is more robust and simpler than ZFNs and TALENs [46]. It is inexpensive, easy to apply, and highly versatile and accurate, even when deployed for multiplex genome editing, i.e., the manipulation of multiple genes at the same time [47]. It has been demonstrated in various model plants (Arabidopsis, tobacco, etc.), crop plants (rice, wheat, maize, tomato, potato, and soybean), and woody plants (apple, poplar, etc.) for durable trait improvement, from achieving higher yield and quality to alleviating biotic and abiotic stresses [48][49][50].
The CRISPR/Cas system creates dsDNA breaks at a desired site in the genome with the help of a guide RNA (20–23 nt long), which is designed to be complementary to the target sequence and binds to one strand of the genomic DNA through Watson–Crick base pairing to facilitate Cas endonuclease-mediated cleavage of the dsDNA [51]. The DSBs are then repaired by the cellular repair machinery through the HDR or NHEJ mechanism, generating genomic modifications such as mutations via deletion and insertion [52].
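Guide selection based on complementarity and a PAM can be sketched as a simple sequence scan. The example below is a minimal illustration under stated assumptions: it uses a 20-nt guide and the NGG PAM commonly associated with Cas9 from Streptococcus pyogenes (neither detail is specified in this text), it ignores off-target scoring entirely, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_guide_sites(seq: str, guide_len: int = 20) -> list:
    """List (strand, start, protospacer, pam) for every site followed by an NGG PAM.

    `start` is the 0-based offset on the scanned strand, so minus-strand
    positions are given in reverse-complement coordinates.
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - guide_len - 2):
            pam = s[i + guide_len:i + guide_len + 3]
            if pam[1:] == "GG":  # assumed NGG PAM: any base followed by GG
                sites.append((strand, i, s[i:i + guide_len], pam))
    return sites


# Hypothetical target: a 20-nt protospacer followed by a TGG PAM
for site in find_guide_sites("A" * 20 + "TGG"):
    print(site)
```

A guide RNA designed against a reported protospacer would carry the same 20-nt sequence (with U in place of T) and pair with the complementary DNA strand.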
Interestingly, there are different Cas endonucleases (class 1 and class 2) that vary in their structure, composition, and functional targets. Of these, class 2 Cas endonucleases are the most commonly used in genome editing (Table 2), and they have been optimized for wide-scale application. With their great functional variation in specificity and nucleic acid target, they have been extended to the precise editing of both DNA and RNA. Further refinement has led to the emergence of innovative techniques such as base editing and prime editing (Table 2) for more precise editing of single or a few nucleotides. The utility of these techniques for improving plant biotic stress tolerance has opened up a new avenue in rice improvement (Figure 1).
3. Future directions
In the present decade, crop improvement is considered a prime way to meet the calorific and nutritional demands of an ever-growing human population. However, methods such as hybridization, somaclonal variation, in vitro tissue culture, and mutagenesis require more manpower, time, and effort, and carry a high chance of failure in obtaining the “desirable traits”. The development of novel tools and technologies is indispensable for scientific advancement. In this regard, methodological advancements in the toolbox of genome/gene-editing technology (including CRISPR/Cas9) offer immense opportunities for transforming agricultural science under the changing climate. They also offer great potential for improving existing crops in a short time and for the de novo domestication of new crops. Among all these technologies, programmable nucleases are at the epicentre of the explosive growth of the genome-editing field. Within these nucleases, CRISPR-associated protein 9 (Cas9), Cas12a, and their derivatives have been the most precise and easy to handle, and have also been employed to avoid backcrossing of a huge number of inbred lines. The new CRISPR/Cas systems may enable researchers to overcome limitations related to protospacer adjacent motif (PAM) sequences, target specificity, the large size of the Cas9 protein, and multi-gene editing. In addition, CRISPR/Cas-based alteration of cis-regulatory element sequences holds great promise for the development of future-ready crops. Single-nucleotide polymorphisms (SNPs) and microRNAs (miRNAs) both play very important roles in gene expression and function in plants, and are closely associated with many agronomic traits.
The latest base editing and prime editing tools widen the application of CRISPR technologies to creating SNPs in the genome and altering miRNA-binding genomic regions, thereby altering the function of genes and regulatory regions, which can be exploited to create improved crop varieties. Besides generating modifications in target genes, CRISPR/Cas can also be used to restructure and engineer chromosomes. The introduction of duplications, inversions of large regions within a chromosome, or translocations between chromosomes can break genetic linkages, providing useful genetic material for crop breeding. Taken together, these points indicate that genome editing technologies hold promise for improving crops.
Figure 1. Schematic illustration of the genome editing strategies adopted for the enhancement of disease resistance in rice. The approaches include functional knockout of the host S genes via deletion, mutation, or replacement of the coding sequences; modification of single nucleotide polymorphisms in the recessive allele of R genes through base editing; targeted modification of non-coding regulatory RNAs (miRNAs) by Cas13; replacement of central regulators like NPR1; engineering of the promoter cis-elements by multiplex editing; modification of negative TFs of plant defense; and metabolic engineering of secondary metabolite (lignin) synthesis to favor plant defense. These strategies have been applied to intervene in the different steps of the pathogenesis of fungi, bacteria, and viruses.
Table 2. Advancements in genome editing tools and their variants available for rice improvement.
| Genome Editing Tools | Variants | Features | Function | Application | References |
| --- | --- | --- | --- | --- | --- |
| ZFN | - | Engineered protein containing a zinc finger domain and an endonuclease domain | Protein-dependent cleavage of any genomic DNA sequence | Editing of any DNA sequence | [53] |
| TALEN | - | Customized protein containing the DNA-binding domain of transcription activator-like effector (TALE) and the FokI restriction enzyme | Protein-dependent cleavage of any genomic DNA sequence | Editing of any DNA sequence | [39] |
| CRISPR/Cas | Cas9 | A ribonucleoprotein complex containing a DNA endonuclease (Cas9) complexed with guide RNAs (gRNAs) | RNA-guided cleavage of dsDNA sequence complementary to the gRNA; efficient genome editing with limited target sites and potential off-target effects due to the long sgRNA (100 nt) | Genome editing with multiplex facility | [54] |
| CRISPR/Cas | Cas12 | A ribonucleoprotein complex containing a DNA endonuclease (Cas12/Cpf1) complexed with crRNA (CRISPR-derived RNA), but not tracrRNA | RNA-guided cleavage of ssDNA and dsDNA sequences with high cleavage efficiency and fewer off-target effects due to short crRNA (40–45 nt) molecules | Precise genome editing | [55] |
| CRISPR/Cas | Cas13 | A ribonucleoprotein complex containing an RNA endonuclease (Cas13) complexed with crRNA | RNA-guided cleavage of ssRNA sequence; suitable for multiplex editing | Robust management of RNA viruses | [56] |
| CRISPR/Cas | Base editing | A ribonucleoprotein complex containing a catalytically impaired Cas9 (nickase) fused with a cytidine deaminase domain and complexed with a single gRNA | G-C to A-T conversion at desired locations in the genome | Nucleotide substitutions; limited by PAM site availability and frequent off-target effects | [57] |
| CRISPR/Cas | Prime editing | A ribonucleoprotein complex containing a Cas9 nickase fused with reverse transcriptase (RT) and complexed with a prime editing guide RNA (pegRNA) | Targeted small insertions, deletions, and base transition by ‘search-and-replace’ method using the pegRNA sequence | Specific nucleotide substitution via gene knock-in at targeted genomic site | [58] |
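The base-editing entry in Table 2 can be illustrated with a toy conversion. The sketch below is only an approximation of how a cytidine deaminase base editor acts on the protospacer strand: it assumes an editing window spanning positions 4 to 8 of the protospacer (counted from the PAM-distal end), which is an assumption not given in this text, and it converts every C in that window to T, corresponding to a G-C to A-T base-pair change. The function name and the example sequence are hypothetical.

```python
def cytosine_base_edit(protospacer: str, window: tuple = (4, 8)) -> str:
    """Convert every C inside the 1-based, inclusive editing window to T.

    The default window (positions 4-8) is an illustrative assumption.
    """
    start, end = window
    edited = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if base == "C" and start <= pos <= end:
            edited.append("T")
        else:
            edited.append(base)
    return "".join(edited)


original = "ACCACCCGTCAAGCTTGACC"  # hypothetical 20-nt protospacer
print(original)
print(cytosine_base_edit(original))
```

Cytosines outside the window are left untouched in this sketch, which is why the choice of guide position relative to the target base matters when designing such an edit.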