1. Genome Editing with Double-Stranded DNA Breaks (DSBs) — CRISPR from Yogurt to Plant Breeding
In 1987, sequencing of a bacterial genome revealed repetitive sequences, which in 2005 were named CRISPRs. It was subsequently found that the variable sequences between these repeats resemble sequences from the viruses that attack bacteria. Danisco later confirmed these matches while studying how yogurt bacteria survive viral attack [1]. The CRISPR mechanism was then studied in detail, and the Cas genes associated with it were identified. CRISPR loci are flanked by different Cas genes and consist of repetitive sequences interspaced by variable sequences (spacers), which correspond to sequences in foreign genetic elements called protospacers. Cas genes encode proteins that degrade the genome of foreign genetic elements such as viruses [2]. The Cas genes were also found to encode protein domains capable of cutting DNA [3][4]. These associated genes serve as the basis for classifying CRISPR systems into three types (I, II, III) (Figure 1) [5]. Each type is distinguished by a specific gene: Cas3 in type I, Cas9 in type II, and Cas10 in type III. In types I and III, several Cas proteins form complexes with crRNA (CRISPR RNA) to identify and destroy target nucleic acids [6]. Type II has fewer Cas proteins, and their biological importance is still elusive [7]. Type II is nevertheless the most commonly used, owing to its accuracy in cutting DNA and generating crRNA. Its Cas9 protein contains two nuclease domains, RuvC and HNH, that are responsible for the DSB in the targeted DNA, which makes this type precise and allows genetic engineering at very low cost [8]. The CRISPR/Cas system has been applied across a wide range of organisms. The human genome has also been edited with CRISPR technology to knock out genes underlying genetic diseases. Recently, CRISPR has been used to study the viral infection COVID-19. Because CRISPR-based detection is sensitive and specific enough to distinguish even a single-nucleotide variation, the system is considered reliable and efficient for detecting viral diseases in humans. Moreover, CRISPR is regarded as a major advance in plant improvement, whether through higher crop yield, resistance to biotic and abiotic stresses, or greater diversity in plant species [9][10].
Figure 1. Schematic mechanism of the bacterial CRISPR system as a defensive tool to degrade the viral genome. Step 1: During invasion, foreign genetic material (the viral genome) enters the bacterial cell. Step 2 (integration of spacers): spacers are inserted into the bacterial genome (shown in yellow), which allows the bacterium to recognize the invader in future attacks. Step 3 (CRISPR RNA formation and processing): the CRISPR array, a noncoding region, is transcribed and matured in a manner specific to each CRISPR system shown in the figure. In types I and III, CRISPR-associated ribonucleases cleave the pre-crRNA between the repeats and release many short crRNAs. In type III, the crRNA is further processed at its 3′ end by RNases that are yet to be identified, producing the mature RNA transcript. Step 4 (destruction of the target genome): for recognition and destruction of the target sites, types I and III use several protein complexes with crRNAs. The Cascade complex is present in type I, and the Csm and Cmr complexes are present in type III for DNA and RNA cleavage, respectively. The Cas3 nuclease bound to the R-loop carries out the process in type I, whereas type II has fewer proteins and requires Cas9 for degradation. Protospacer adjacent motifs (PAMs) help Cas9 identify the target sites in type II. In both types I and II, self-targeting is prevented because the spacer sequences in the CRISPR array lack a PAM.
2. CRISPR/Cas9 and Cpf1 in Genome Editing
The development of the CRISPR/Cas9 mechanism (Figure 1) for the improvement of crops is based on the bacterial defensive mechanism. CRISPR/Cas9 functions in three steps: (1) Acquisition: spacer DNA is acquired from the viral DNA or resident plasmids, a step associated with DSBs, and is inserted into the bacterial genome (to memorize the invading viral DNA); (2) Expression: crRNA is produced by transcription of the CRISPR array, together with expression of the Cas9 protein; (3) Interference: crRNA acts as a guide RNA that directs the Cas9 protein to the targeted DNA adjacent to a PAM site, where Cas9 cuts both DNA strands three nucleotides away from the PAM [11].
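The interference step can be illustrated with a minimal Python sketch that scans one DNA strand for candidate Cas9 target sites. It assumes a 20-nucleotide protospacer followed directly by an NGG PAM and places the cut three nucleotides upstream of the PAM, as described above. The function name `find_cas9_sites` and the returned tuple layout are hypothetical, chosen only for illustration.

```python
import re


def find_cas9_sites(seq, guide_len=20):
    """Return (protospacer, pam, cut_index) for every NGG PAM on the given strand.

    cut_index is the 0-based position of the first base after the cut,
    i.e. the cut lies three nucleotides upstream of the PAM.
    Only the supplied strand is scanned (assumption: the caller also
    passes the reverse complement if both strands are of interest).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for match in pattern.finditer(seq):
        protospacer, pam = match.group(1), match.group(2)
        cut_index = match.start() + guide_len - 3
        sites.append((protospacer, pam, cut_index))
    return sites


if __name__ == "__main__":
    example = "CC" + "ACGT" * 5 + "TGG" + "AA"
    for protospacer, pam, cut in find_cas9_sites(example):
        print(protospacer, pam, cut)
```

Dedicated design tools such as those cited in the text also score specificity and efficiency; this sketch only enumerates positions.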
In plants, the CRISPR/Cas9 system edits the genome using several components, including the Cas9 protein and an sgRNA. The sgRNA, a fusion of crRNA and tracrRNA, is first designed in silico. Many online algorithm-based tools are available for designing specific and precise sgRNAs, for example, CRISPR-P and CHOPCHOP [12][13]. Separate expression cassettes must be constructed for Cas9 and the sgRNA. Small nuclear RNA gene promoters (U3 or U6) drive transcription of the sgRNA by RNA polymerase III and define the initiation and termination sites.
For successful cleavage of the specified site, the sgRNA and the targeted DNA sequence should match, except for the first nucleotide (5′ G or A). To direct the expressed Cas9 to the nucleus, a single or dual NLS (nuclear localization signal) is fused to the Cas9 coding sequence (4107 bp). Both the Cas9 and sgRNA expression cassettes are assembled in vectors for the subsequent genome editing procedures. Before the final genome editing step, protoplasts are transformed with the CRISPR construct to analyze and validate the sgRNA activity [14]. Next, a PCR or restriction enzyme digestion step is used to select the active CRISPR construct. The final vector, which contains the CRISPR/Cas9 setup, is introduced into plant cells via Agrobacterium-mediated transformation or particle bombardment [15]. After transformation into a plant cell, the following steps take place: activation of the Cas9 protein, cleavage at the targeted sites, and production of DSBs. In the activation step, the gRNA activates the Cas9 protein; without bound gRNA, Cas9 is nonfunctional. The Cas9 protein of the bacterium Streptococcus pyogenes (SpCas9) is widely used in plants and recognizes NGG-type PAM sites.
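The first-nucleotide rule mentioned above can be sketched in a few lines of Python. The sketch assumes that a U6 promoter requires a 5′ G and a U3 promoter a 5′ A, and that the first base of the guide is simply replaced. This mapping and the helper names `adapt_guide` and `mismatches_after_first` are illustrative assumptions, not part of any cited tool.

```python
# Assumed 5' nucleotide required by each promoter (illustrative mapping).
REQUIRED_FIRST_BASE = {"U6": "G", "U3": "A"}


def adapt_guide(guide, promoter):
    """Return the guide with its first base set to the one the promoter needs."""
    guide = guide.upper()
    return REQUIRED_FIRST_BASE[promoter] + guide[1:]


def mismatches_after_first(guide, target):
    """Count mismatches between guide and target, ignoring position 1."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return sum(g != t for g, t in zip(guide[1:].upper(), target[1:].upper()))


if __name__ == "__main__":
    target = "TCGATCGATCGATCGATCGA"
    guide = adapt_guide(target, "U6")
    print(guide, mismatches_after_first(guide, target))
```

Some protocols prepend an extra nucleotide instead of replacing the first one; the replacement shown here follows the description in the text.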
The CRISPR/Cas9 technique (Figure 2) is being continuously improved for efficient genome editing. CRISPR systems are grouped into two classes, class 1 and class 2, based on their effector molecules. Class 1 systems use multi-subunit effector complexes built from different Cas proteins, while class 2 systems use a single effector protein [16]. These classes are further divided into six types: I, II, III, IV, V, and VI. Class 1 contains types I, III, and IV, while class 2 contains types II, V, and VI. The types carry different Cas genes: Cas3 in type I, Cas9 in type II, Cas10 in type III, Cas12 in type V, and Cas13 in type VI, while type IV is a putative type [17]. Among these, type II is the most commonly used due to its high efficiency in genome editing. Although class 1 accounts for 90% of CRISPR/Cas systems, it is less studied and rarely used in genome editing because of its complexity [18], while class 2 is more widely studied and used in genome editing owing to the Cas9, Cas12, and Cas13 genes.
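The classification described above can be captured as a small lookup table. The following Python sketch encodes only what the text states (class membership and signature gene per type); the dictionary name and helper function are hypothetical.

```python
# Class and signature gene for each CRISPR type, as stated in the text.
# Type IV is described as putative, so no signature gene is recorded for it.
CRISPR_TYPES = {
    "I": {"class": 1, "signature": "Cas3"},
    "II": {"class": 2, "signature": "Cas9"},
    "III": {"class": 1, "signature": "Cas10"},
    "IV": {"class": 1, "signature": None},
    "V": {"class": 2, "signature": "Cas12"},
    "VI": {"class": 2, "signature": "Cas13"},
}


def types_in_class(class_number):
    """List the CRISPR types that belong to the given class, in table order."""
    return [name for name, info in CRISPR_TYPES.items()
            if info["class"] == class_number]


if __name__ == "__main__":
    print(types_in_class(1))
    print(types_in_class(2))
```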
Figure 2. Mechanism of CRISPR/Cas9 and Cpf1 for editing the plant genome: (a) schematic view of CRISPR/Cas9 and (b) of CRISPR/Cpf1. Both GETs are used to edit plant genomes. In both, the desired target DNA (about 20 nucleotides) and an adjacent PAM site are first selected in the plant genome. Several bioinformatics tools for sgRNA design are available and identify the best gRNA for the subsequent GE steps. The sgRNA is cloned, and a vector is constructed and delivered into the genome by Agrobacterium tumefaciens-mediated plant transformation. Transgenic plants are developed in a couple of steps (shown in the dotted-line box) and are then regenerated and screened by genotyping analysis.
Recently, the type V CRISPR/Cas system has been identified with several subtypes. The most studied are Cpf1 (Cas12a), type V-A, and C2c1 (Cas12b), type V-B. Cpf1 is now considered a good alternative to Cas9 because of its efficiency as a GET. CRISPR/Cpf1 (Cas12a) refers to CRISPR from Prevotella spp. and Francisella spp. CRISPR/Cpf1 is often preferred over CRISPR/Cas9 because of its shorter guide RNA and the smaller size of the Cpf1 protein. Its guide consists of a short crRNA alone, whereas the CRISPR/Cas9 mechanism requires both crRNA and tracrRNA [19][20]. The guide directs the Cpf1 nuclease to bind the targeted region downstream of the PAM. In contrast to Cas9, Cpf1 prefers T-rich rather than G-rich PAMs and cleaves the targeted DNA distal to the PAM in a staggered fashion, generating sticky ends [21]. CRISPR/Cpf1 has been used in many plants [22]. Improving crop traits requires inserting or deleting nucleotide sequences. For this purpose, the cell's natural repair machinery is engaged: HDR inserts nucleotide sequences precisely at the cleavage site, whereas NHEJ produces random insertions or deletions [23].
Recently, CRISPR/Cas12b (C2c1), a dual-RNA-guided endonuclease related to Cpf1, has been developed. Cas12b activity is temperature inducible; hence, it can be used in developing plants resistant to high temperatures. Cas12b produces the longest sticky ends of all the CRISPR systems, generating DSBs with 6–8-nucleotide overhangs. Cas12b is smaller than both Cas9 and Cas12a. Moreover, Cas12b, just like Cas9, needs a crRNA and a tracrRNA, which can be combined into an sgRNA, for DNA targeting [24].
3. Role of Genome Editing (with DSBs) in Cereal Genome Improvement
To date, GETs such as CRISPR/Cas9 and Cpf1 have been used to increase the production and disease resistance of crops, as shown in Table 1. CRISPR/Cas9- and Cpf1-based GETs are more efficient than meganucleases (MNs), ZFNs, and TALENs, and they represent a breakthrough in agriculture for improving targeted plant traits with greater precision and accuracy and fewer off-target effects [1][2][12]. These GETs are broadly applicable to the improvement of cereal crops [25][26][27].
Table 1. Achievements in cereals by using GETs.
| Gene Editing Tool | Crop | Targeted Gene | Targeted Trait | Reference |
| --- | --- | --- | --- | --- |
| CRISPR/Cas9 | Wheat | TaLOX2 | Development of grain | [28] |
| CRISPR/Cas9 | Maize | LIG1, Ms26, Ms45, ALS1, and ALS2 | Chlorsulfuron-resistant | [29] |
| CRISPR/Cas9 | Rice | GS3, GW2, GW5, TGW6 | Improved grain related parameters | [30] |
| CRISPR/Cas9 | Wheat | Gli-2 loci | Low-gluten foodstuff | [31] |
| CRISPR/Cas9 | Rice | OsPRX2 | Improved salt tolerance level | [32] |
| CRISPR/Cas9 | Wheat | TaInox, TaPds | Chlorophyll synthesis | [33] |
| CRISPR/Cas9 | Rice | Waxy | Enhanced glutinosity | [34] |
| CRISPR/Cas9 | Rice | Hd2, Hd4, Hd5 | Early heading | [35] |
| CRISPR/Cas9 | Maize | PPR, RPL | Reduced zein protein | [36] |
| CRISPR/Cas9 | Maize | ARGOS8 | Drought tolerance | [37] |
| CRISPR/Cas9 | Rice | OsNAC041 | Salt tolerant | [38] |
| CRISPR/Cas9 | Maize | ZmHKT1 | Salt tolerant | [39] |
| CRISPR/Cas9 | Rice | LAZY1 | Tiller spreading | [40] |
| CRISPR/Cas9 | Rice | Gn1a, GS3, DEP1 | Enhanced grain number, larger grain size, and dense erect panicles | [41] |
| CRISPR/Cas9 | Wheat | GW2 | Increased grain weight and protein content | [42] |
| CRISPR/Cas9 | Wheat | TaGASR7, TaGW2, TaDEP1, TdGASR7 (durum wheat) | Grain development, kernel length, storability, and plant height and weight | [43] |
| CRISPR/Cas9 | Wheat | TaGW2, TaGASR7 | Grain and kernel length and weight | [44] |
| CRISPR/Cas9 | Wheat | α-gliadin, gamma-gliadins | Gliadins | [45] |
| CRISPR/Cas9 | Wheat | TaLOX2, TaUbil1 | Grain development | [46] |
| CRISPR/Cas9 | Wheat | TaDREB2, TaERF3 | Drought signaling | [47] |
| CRISPR/Cas9 | Wheat | TaCER9, TaLOX2, TaGW2 | Grain development | [48] |
| CRISPR/Cas9 | Wheat | TaGW2, TaLpx-1, TaMLO | Kernel width and weight; resistance to powdery mildew | [49] |
| CRISPR/Cas9 | Wheat | α-gliadin genes | Low-gluten wheat | [31] |
| CRISPR/Cas9 | Wheat | TaMs45 | Male fertility | [50] |
| CRISPR/Cas9 | Rice | OsSWEET13 | Bacterial blight resistance | [51] |
| CRISPR/Cas9 | Rice | SBEIIb | High amylose content | [52] |
| CRISPR/Cas9 | Wheat | EDR1 | Powdery mildew resistance | [53] |
| CRISPR/Cas9 | Rice | OsERF922 | Enhanced rice blast resistance | [54] |
| CRISPR/Cas9 | Rice | OsSWEET13 | Bacterial blight resistance | [54] |
| CRISPR/Cas9 | Maize | TMS5 | Thermosensitive male-sterile | [55] |
| CRISPR/Cas9 | Rice | OsMATL | Induction of haploid plants | [56] |
| CRISPR/Cas9 | Rice | OsPIN5b and GS3, OsMYB30 | High yielding and cold tolerance | [57] |
| CRISPR/Cas9 | Rice | ALS | Herbicide resistance | [28] |
| CRISPR/Cas9 | Rice | LAZY1 | Tiller spreading phenotype | [40] |
| CRISPR/Cas9 | Rice | Gn1a, DEP1, GS3 | Number of grains, erect panicles, specific for grain size | [41] |
| CRISPR/Cas9 | Rice | SBEIIb | High amylose rice | [52] |
| CRISPR/Cas9 | Rice | OsERF922 | Rice blast resistance | [51] |
| CRISPR/Cas9 | Rice | OsEPSPS | Glyphosate resistant | [58] |
| CRISPR/Cas9 | Rice | ALS | Herbicide resistance | [56] |
| CRISPR/Cas9 | Rice | ALS | Herbicide resistance | [59] |
| CRISPR/Cas9 | Rice | EPSPS | Herbicide resistance | [58] |
| CRISPR/Cas9 | Rice | ALS | Herbicide resistance | [60] |
| CRISPR/Cas9 | Maize | ALS | Herbicide resistance | [29] |
| CRISPR/Cas9 | Maize | ARGOS8 | Drought stress tolerance | [61] |
| CRISPR/Cas9 | Wheat | TaMLOA1, TaMLOB1, TaMLOD1 | Resistance to powdery mildew | [62] |
| CRISPR/Cas9 | Maize | PDS, IPK1A, IPK | Phytic acid content | [63] |
| CRISPR/Cpf1 | Rice | OsEPFL9 | To regulate the stomatal density in leaf | [64] |
| CRISPR/Cpf1 | Rice | OsROC5 and OsDEP1 | Editing efficiency was compared on varying temperature | [43][44] |
| CRISPR/Cpf1 | Maize | GL2 | Editing efficiency was compared on varying temperature | [65] |
| CRISPR/Cpf1 | Rice | DL, ALS, NCED1, AO1 | Drooping leaf phenotype | [66] |
| CRISPR/Cpf1 | Rice | OsPDS, OsBEL | Heritable mutations | [67][68] |
| CRISPR/Cpf1 | Rice | OsRLK, OsBEL | Albino phenotype | [69] |
| CRISPR/Cpf1 | Maize | glossy2 | Efficiency compared with CRISPR/Cas9 | [70] |
| CRISPR/Cpf1 | Rice | OsPDS, OsGS3 | Improved the editing efficiency | [71] |
| CRISPR/Cpf1 | Rice | OsDEP1, OsROC5, OsPDS | Tenfold reduction in miR159b transcription, transcriptional repression | [72] |
| CRISPR/Cpf1 | Rice | DEP1, PDS, EPFL9 | Efficient editing at all TTTV PAM sites | [73] |
| TALENs | Rice | OsSWEET14 | Bacterial blight resistance | [74] |
| TALENs | Wheat | TaMLO | Powdery mildew resistance | [62] |
| TALENs | Maize | ZmGL2 | Reduced epicuticular wax in leaves | [75] |
| TALENs | Rice | OsBADH2 | Fragrant rice | [76] |
| TALENs | Rice | DEP1, CKX2, BADH2, SD1 | Rapid and efficient gene modification in rice | [77] |
| TALENs | Maize | ZmMTL | Induction of haploid plants | [78] |
| TALENs | Maize | PDS, IPK1A, IPK and MRP4 | Reduce the phosphorus concentration | [79] |
| TALENs | Wheat | TaMLO | Powdery mildew resistance | [62] |
| ZFNs | Maize | PAT | Herbicide resistance | [80] |
| ZFNs | Rice | OsQQR | Detection of safe harbor loci herbicide | [81] |
| ZFNs | Maize | ZmIPK1 | Herbicide tolerant and phytate reduced maize | [82] |
| ZFNs | Maize | ZmTLP | Trait stacking | [83] |
| ZFNs | Rice | OsQQR | Trait stacking | [81] |
| MNs | Maize | lg1, ms26 | Targeted mutation | [84] |
| MNs | Maize | ms26 | Male sterility | [85] |
| MNs | Wheat | DsRed | Removed selectable markers | [86] |
4. Genome Editing without DSBs and Donor Template
CRISPR/Cas9 is a versatile tool for editing plant genomes precisely and efficiently. Despite its many contributions to plant genome improvement, it may cause harmful mutations owing to off-target effects, and these mutations may have unpredictable consequences in subsequent generations. Off-target mutations can be detected in vitro or in vivo with methods such as CIRCLE-seq, GUIDE-seq, DISCOVER-seq, SITE-seq, and Digenome-seq [87]. These mutations are caused by DSBs. To avoid them, novel approaches that do not induce DSBs (Figure 3) [64][65] can be used to modify the genome at the targeted DNA [88].
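The detection methods listed above are experimental. As a purely illustrative complement, candidate off-target sites can also be shortlisted in silico by counting mismatches between a guide and every PAM-adjacent window of a sequence. The Python sketch below assumes an NGG PAM, scans only the given strand, and uses a hypothetical threshold of three mismatches; the function names are illustrative and do not correspond to any of the cited methods.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_off_targets(guide, genome, max_mismatches=3):
    """Return (position, site, mismatches) for windows followed by an NGG PAM.

    Only the supplied strand is scanned; the mismatch threshold is an
    assumption made for illustration.
    """
    guide, genome = guide.upper(), genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 2):
        # The two bases after the N of the PAM must be GG.
        if genome[i + n + 1:i + n + 3] != "GG":
            continue
        site = genome[i:i + n]
        mismatches = count_mismatches(guide, site)
        if mismatches <= max_mismatches:
            hits.append((i, site, mismatches))
    return hits


if __name__ == "__main__":
    guide = "ACGT" * 5
    genome = guide + "AGG" + "TT" + "ACGTACGTACGTACGTACCT" + "TGG"
    for hit in candidate_off_targets(guide, genome):
        print(hit)
```

A hit with zero mismatches is the intended target; hits with one or more mismatches are candidates that would still need experimental confirmation.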
Figure 3. Modified from [64][65]; novel GETs that do not produce DSBs: (a) base editing, (b) epigenetic modification, and (c) prime editing. In (a), two genes (TaALS and TaACCase) are co-edited by base editing, in which dCas9 is coupled with a cytosine base editor (CBE). In this way, edited wheat plants are developed without producing any DSBs. (b) Epigenetic editing: the dCas9-SunTag-hTET1cd system is used to demethylate the FWA promoter and activate FWA gene expression. (c) Prime editing works through a complex formed by the pegRNA, the Cas9 nickase-reverse transcriptase (RT) fusion, and the target DNA. Besides the primer binding site (PBS), the pegRNA carries the desired sequence to be introduced into the host genome. The nicked DNA strand pairs with the PBS and serves as the primer for RT; RT copies the information from the pegRNA, and the RT product is integrated at the target genomic site. Initially, the modification is present on only one strand of the targeted DNA. Later, it is present on both strands owing to the cell's repair mechanism.
New approaches such as base editing [89] and prime editing [90] exploit Cas9 variants carrying the Asp10Ala and His840Ala mutations (dCas9), fused with other effector proteins, to bind at specified genome locations. Such fusions can alter a single base pair without any cleavage in that region [91]. dCas9 no longer has nuclease activity but is still directed to the target site by the sgRNA.
4.1. Base Editing
Conventional genome editing requires a gRNA, the Cas9 protein, a donor template, and the cell's repair mechanism, whereas base editing uses a programmable deaminase to convert bases at the targeted sites without cleavage or induction of DSBs [92]. For this purpose, the CBE (cytosine base editor) and ABE (adenine base editor) have been developed to convert C to T and A to G, respectively [25]. In humans, spontaneous hydrolytic deamination converts C to T and A to G about 500 times per cell per day [93]. CBEs include different base editors, such as Target-AID and the BE series. In Target-AID, the PmCDA1 deaminase is fused with a catalytically impaired Cas9 (dCas9 or the nickase Cas9n, D10A) to perform base editing. In the BE series, the rAPOBEC1 deaminase is fused with the same type of Cas9 variant. A CBE deaminates C to U, and the U is then read as T during DNA repair and replication. CBEs have already been used in crops including tomato, wheat, rice, maize, and Arabidopsis, while ABE, which deaminates A to produce an A-to-G conversion, has been reported in wheat, rice, Arabidopsis, and Brassica napus [25]. Its role in improving cereal genomes is discussed in the section “Genome Editing (Without DSBs) Role in Cereals’ Improvement”.
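The outcome of base editing can be illustrated with a short Python sketch that applies C-to-T (CBE) or A-to-G (ABE) conversions inside an editing window of the protospacer. The window of positions 4 to 8, counted from the PAM-distal end, is an assumption made for illustration, as is the simplification that every editable base in the window is converted; real editors differ in window and efficiency. The function name `base_edit` is hypothetical.

```python
# Net conversion produced by each editor class, as described in the text.
CONVERSIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def base_edit(protospacer, editor="CBE", window=(4, 8)):
    """Return the protospacer after idealized base editing.

    Positions are 1-based and counted from the PAM-distal end.
    The default window is an illustrative assumption.
    """
    source, product = CONVERSIONS[editor]
    first, last = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if first <= position <= last and base == source:
            edited.append(product)  # base converted without any DSB
        else:
            edited.append(base)
    return "".join(edited)


if __name__ == "__main__":
    print(base_edit("ACGTCCATGA", "CBE"))
    print(base_edit("ACGTCCATGA", "ABE"))
```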
4.2. Epigenetic Editing
Epigenetics refers to modifications of the genome that do not alter the DNA sequence, such as histone modification, DNA methylation, DNA demethylation, gene imprinting, and chromatin remodeling [94]. These epigenetic modifications are common in plants [95]. Plants possess specialized epigenome-modifying mechanisms that protect them against various biotic and abiotic stresses [96]. For targeted epigenome editing, the CRISPR/Cas component Cas9 is used in the form of dCas9. dCas9 is fused with an epigenetic modifier for targeted modification, which alters gene expression [91]. For example, Gallego-Bartolomé and colleagues modified a plant genome epigenetically through targeted DNA demethylation, which produced a late-flowering phenotype [91]. These epigenetic modifications are also maintained in the progeny. However, much work is still needed to explore this technology in cereals.
4.3. Prime Editing
Prime editing is another new genome editing technique; it uses a Cas9 nickase together with a prime editing guide RNA (pegRNA) to edit the genome precisely through a “search and replace” mechanism [97]. In the CRISPR/Cas9 mechanism, DSBs are generated, and these are associated with complex undesired effects, including p53 activation and translocations [98]. Prime editing was first developed by Liu and colleagues in 2019 [99]. It can perform insertions, deletions, and all base conversions without requiring a donor template or producing DSBs. The prime editing system combines a Cas9 nickase fused to an engineered reverse transcriptase with a pegRNA. The programmable pegRNA carries both the information about the binding site and the desired sequence that replaces the targeted DNA nucleotides [97]. To increase the efficiency of genome editing, three successive versions were developed: prime editor 1, prime editor 2, and prime editor 3. In plants, prime editing has been successfully employed in wheat, rice, and maize [100]. More research is needed before this technology can be used for large insertions or deletions without creating DSBs. However, for small insertions and deletions, it is considered more efficient than the CRISPR/Cas9 gene editing tool [97]. Advances in prime editing have produced an improved system called the engineered plant prime editor (ePPE). Combining pegRNAs with ePPE has further enhanced editing efficiency. Recent research on ePPE has reported rice plants tolerant to herbicides such as sulfonylurea and imidazolinone [100].
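The “search and replace” logic can be illustrated with a simplified Python model. It treats the pegRNA as three DNA-alphabet strings (spacer, PBS, and RT template), locates the spacer on the strand that is nicked, checks that the PBS pairs with the DNA just 5′ of the nick, and then writes the reverse-transcribed sequence in place of the original bases 3′ of the nick. The nick position (three nucleotides from the 3′ end of the protospacer), the class and function names, and the equal-length replacement are all assumptions made for illustration; flap resolution and repair of the second strand are not modeled.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class PegRNA:
    spacer: str       # same sequence as the protospacer on the nicked strand
    pbs: str          # reverse complement of the DNA just 5' of the nick
    rt_template: str  # reverse complement of the desired DNA 3' of the nick


def prime_edit(strand, peg):
    """Return the nicked strand after an idealized prime edit."""
    start = strand.find(peg.spacer)
    if start < 0:
        raise ValueError("spacer not found on this strand")
    nick = start + len(peg.spacer) - 3  # assumed nick position
    primer = strand[max(nick - len(peg.pbs), 0):nick]
    if revcomp(peg.pbs) != primer:
        raise ValueError("PBS does not pair with the nicked strand")
    new_dna = revcomp(peg.rt_template)  # product of reverse transcription
    # Equal-length replacement: the new DNA overwrites the old bases.
    return strand[:nick] + new_dna + strand[nick + len(new_dna):]


if __name__ == "__main__":
    spacer = "ACGTTGCAACGTTGCAACGT"
    strand = "GG" + spacer + "TGG" + "ACGTAC"
    peg = PegRNA(spacer, revcomp(strand[11:19]), revcomp("AGTTGG"))
    print(strand)
    print(prime_edit(strand, peg))
```

In this example the RT template encodes a single C-to-A substitution at the nick, so the edited strand differs from the original at one position only.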