Applications of Artificial Intelligence in Smart-Crop Breeding: Comparison
Please note this is a comparison between Version 1 by Hafeez Khan and Version 2 by Jessie Wu.

Artificial intelligence (AI) has emerged as a revolutionary field that offers great opportunities for shaping modern crop breeding, and it is already used extensively in indoor plant science. Advances in crop phenomics and enviromics, together with the other “omics” approaches, are paving the way for elucidating the complex biological mechanisms that drive crop functions in response to environmental perturbations. These “omics” approaches have given plant researchers precise tools to evaluate important agronomic traits across larger germplasm collections, in less time, and at early growth stages. However, the resulting big data and the complex relationships within them impede the understanding of how genes drive the formation of agronomic traits. AI brings substantial computational power and many new tools and strategies for future breeding.

  • artificial intelligence (AI)
  • crop breeding
  • genomics

1. Artificial Intelligence Technologies Benefiting Crop Breeding

Artificial intelligence uses computers and technology to simulate the human mind’s problem-solving and decision-making skills [1][11]. AI, often known as machine intelligence, is an area of computer science that focuses on developing and managing technology that can learn to make decisions and carry out activities independently, without human intervention [2][12]. It is a broad term that encompasses a wide range of technologies: a catch-all for any software or hardware component that supports machine learning, computer vision, natural language understanding, and natural language processing (NLP) [3][13]. Today’s AI runs on traditional complementary metal-oxide-semiconductor (CMOS) hardware and relies on the same fundamental computational processes that drive traditional software [4][14]. AI is the most rapidly emerging technology in computer science in today’s digital world, and it creates intelligent computers that replicate the intellect of the human mind [5][15]. Examples of machine-learning algorithms include the deep neural network (DNN), artificial neural network (ANN), random forest (RF), and support vector machine (SVM); these are complemented by advanced equipment such as the internet of things (IoT) [6][16]. AI offers vast opportunities for agricultural applications and thereby opens up new frontiers for digital breeding [7][17]. Future AI generations are projected to inspire new kinds of brain-inspired circuits and architectures capable of making data-driven judgments faster and more precisely than humans can [8][18]. Furthermore, artificial intelligence, big data, machine learning, and data analytics are all terms that appear often in current academic and corporate writing that deals with data [9][19].
Big data, machine learning, and AI are some of the terms used to characterize modern computer processes [10][20]. Big data is concerned with the use of huge datasets of diverse types and complex structures that cannot be handled well by classical approaches [11][21]. In this context, AI trains a computer to perform tasks that are impractical for humans given the time and labor involved, and that typically require decision-making in a variety of situations [12][22]. Machine learning (ML) is a branch of AI in which computers discover relationships from massive training datasets. For environment and weather applications, a simple definition is as follows: firstly, big data involves the collection of meteorological or Earth System-related measurements, as well as high spatial and temporal resolution Earth System model (ESM) outputs for analysis; secondly, ML is the refining or discovery of new linkages between locations, times, and quantities in the datasets (e.g., where sea surface temperature features aid weather prediction months ahead over land regions); thirdly, AI is a means of providing automatic warnings and guidance to society in the event of oncoming weather extremes, based on the links discovered by machine learning [13][23]. The current ease of applying ML methods owes much to improved computing capabilities, aided in part by the novel use of graphics processing units (GPUs), whose speed is improving at a quicker rate than that of ordinary central processing units (CPUs) [14][24]. This approach also uses computer memory in an innovative way that makes calculations both more efficient and considerably closer to the data storage location [13][23]. The main emphasis of employing AI in breeding is that it complements the work of the breeder by guaranteeing continuous farm monitoring.
Indeed, with the automation of farms and the generalization of data, breeders can spend less time in their buildings and dedicate more time to higher-value jobs. AI saves considerable time in data identification and processing. Breeders and technical advisors gain confidence and responsiveness, allowing them to act when it is most appropriate [15][25].
AI technology has been used to accelerate the breeding of new plant varieties, for example by applying high-throughput genomics and phenomics to advanced breeding [5][16][17][18][15,26,27,28]. Increasingly, ML methodology has been used in genomic prediction, genomic selection, and marker-assisted selection [17][18][27,28]. Many agricultural companies such as Monsanto and John Deere have already invested hundreds of millions of dollars in developing technologies that use extensive data on soil type, seed variety, and weather to help farmers reduce costs and enhance yields [19][29]. Many of the same data sources, such as weather forecasts and Google Maps, fuel both of their businesses. In addition, they may access farm equipment data that are sent wirelessly to the cloud [20][30]. As part of a precision-farming experiment in Romania, companies such as Nippon Electric Company, Limited (NEC; headquartered in Minato, Tokyo, Japan) and Dacom (headquartered in Santa Clara, USA) employed environmental sensors and big-data analytics tools to increase yields. The use of current technologies and information systems enhances the overall productivity of agriculture [21][31]. Because agricultural datasets are complex, novel architectures, frameworks, algorithms, and analytics face several obstacles in extracting value and hidden information from these data [22][32]. Recent research on AI tools, including ML, deep learning, and predictive analysis, aims to increase planning, learning, reasoning, thinking, and action-taking abilities [23][33]. Plant breeders are developing systems to aid in a better understanding of plant behavior under a variety of climatic situations [24][34]. Summit, the world’s most powerful supercomputer at the time of its unveiling, was announced with the capacity to hold 27,000 GPUs, paving the way for a bright future.
AI has the potential to be a game-changer in the near future for bringing an agricultural revolution and global food security [25][35].
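To make the idea of genomic prediction mentioned above more concrete, the following is a minimal sketch of one common approach, ridge regression on marker genotypes, written in plain Python. It is an illustration under stated assumptions rather than the method of any cited study: the genotype coding (0/1/2 copies of an allele), the toy data, the penalty `lam`, and every function name are hypothetical.

```python
# Illustrative ridge-regression genomic prediction on toy data.
# All names, data, and the penalty value are hypothetical assumptions.

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_marker_effects(genotypes, phenotypes, lam=1.0):
    """Estimate marker effects by solving (X'X + lam*I) beta = X'y on centered data."""
    n, p = len(genotypes), len(genotypes[0])
    marker_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    pheno_mean = sum(phenotypes) / n
    X = [[row[j] - marker_means[j] for j in range(p)] for row in genotypes]
    y = [value - pheno_mean for value in phenotypes]
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve_linear_system(A, rhs), marker_means, pheno_mean


def predict_phenotype(genotype, effects, marker_means, pheno_mean):
    """Predict a phenotype for an unphenotyped line from its marker genotype."""
    return pheno_mean + sum(
        (g - m) * e for g, m, e in zip(genotype, marker_means, effects)
    )


# Toy training set: six lines, three markers coded as 0/1/2 allele copies.
TRAIN_GENOTYPES = [[0, 0, 0], [1, 0, 2], [2, 1, 0], [0, 2, 1], [1, 1, 1], [2, 2, 2]]
TRUE_EFFECTS = [1.0, -0.5, 0.0]  # hypothetical effects used to simulate phenotypes
TRAIN_PHENOTYPES = [
    10.0 + sum(g * e for g, e in zip(row, TRUE_EFFECTS)) for row in TRAIN_GENOTYPES
]

effects, marker_means, pheno_mean = fit_marker_effects(
    TRAIN_GENOTYPES, TRAIN_PHENOTYPES, lam=1.0
)
print(predict_phenotype([2, 0, 1], effects, marker_means, pheno_mean))
```

The penalty shrinks all marker effects toward zero, which is what keeps the estimate stable when markers greatly outnumber phenotyped lines, as is typical in breeding populations. In practice a numerical library would replace the hand-written solver.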

2. Exploring the Potential of Artificial Intelligence in Gene Function Analysis

The rapid development of high-throughput technologies in biological sciences has resulted in the generation of massive data in recent decades. Disciplines that attempt to collect and analyze enormous volumes of biological data are often referred to as “omics”, a term derived from “genome” (the total DNA contained in each cell of an organism) and carrying an additional flavour of openness to big challenges [26][47]. “Omics” data have become too large and complicated to be analyzed visually or by using statistical correlations. This has prompted the use of so-called machine intelligence, or AI, which manages amounts of data that are unmanageable for human minds, extracts information that goes beyond our current understanding of the system under investigation and, most importantly, improves automatically based on the training data [27][48]. AI is already being used extensively in plant genomics and has further potential applications for in-depth genome exploration. A number of ML tools and algorithms are available for different kinds of bioinformatics analysis, such as protein-coding gene identification, cis-regulatory element identification, gene expression, subcellular location, protein-protein interaction, gene ontology, metabolic pathways, phenotypes, and genomic prediction (as reviewed by Mahood et al. (2020) [28][49]). In the not-too-distant future, AI is likely to be used to address a variety of plant genomics questions. AI algorithms might potentially be used for comparative genomic investigations or for transferring information from a model plant to a crop of interest [29][50]. DeepBind [30][51] and DeepSEA [31][52] are two models that have been created in recent years to predict and analyze genetic features [16][26]. Various kinds of expression or sequencing data analysis can be envisaged, with the goal of predicting gene functions or the differential effects of gene expression on a trait [32][53].
Breakthroughs in high-throughput sequencing technology have produced a significant amount of genomic data, but the sheer volume creates a huge problem for storing and examining those data [16][26]. AI-based bioinformatics enables the simultaneous measurement of the expression of a large number of genes, or even of every gene in the genome, under a wide range of conditions [33][34][54,55]. All of this combines to give biologists a more “relevant” representation of their data and the ability to integrate it, which enables them to examine their genomic data, test and confirm their assumptions throughout the experimental cycle, and ultimately improve their research [35][36][56,57].
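Sequence-based deep-learning models such as DeepBind and DeepSEA take DNA as a one-hot-encoded matrix and apply convolutional filters that act as learned motif detectors. The sketch below illustrates only that input representation and a single filter scan in plain Python; it is not the architecture of either model, and the filter weights, the example motif, and the function names are hypothetical assumptions.

```python
# Illustrative one-hot encoding of DNA and a single "convolutional filter" scan.
# Weights and sequences are hypothetical; real models learn many filters from data.

BASES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Any other symbol (for example N) becomes an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence.upper()]


def scan_filter(encoded, weights):
    """Slide a (width x 4) weight matrix along the sequence and score each window."""
    width = len(weights)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + k][j] * weights[k][j]
            for k in range(width)
            for j in range(4)
        )
        scores.append(score)
    return scores


# A hand-made filter that rewards each position matching the motif TATA.
TATA_FILTER = one_hot("TATA")

scores = scan_filter(one_hot("GGTATAGG"), TATA_FILTER)
best_start = max(range(len(scores)), key=lambda i: scores[i])
print(scores, best_start)
```

In a trained network the filter weights are real-valued and learned, and the window scores feed further layers; the scan itself works the same way.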

3. Linking of Crop Genome to Phenome with Artificial Intelligence

Currently, modern breeding approaches are focused on linking the genotype with the crop phenotype accurately and precisely. In advanced breeding, linking whole-genome information to high-throughput phenotypes remains a massive challenge and is impeding the optimal application of field phenotyping and omics [5][15]. Through AI, phenomics and genomics data from germplasm collections and mapping populations can be differentiated efficiently. Crop diversity studies, single nucleotide polymorphism (SNP) detection and selection, quantitative trait loci (QTL) analysis, genome-wide association study (GWAS) analysis, genomic selection, and genome sequencing generate a large amount of data; AI can evaluate and link the phenomics and genomics data within these big data to improve breeding approaches. AI based on computational and training models can support gene functional analysis and high-throughput crop phenotyping and can also predict the yield and trait performance of a crop [26][29][37][38][39][46,47,50,58,59]. Therefore, integrating AI with phenomics and genomics tools can allow rapid identification of genes associated with crop phenotypes, which eventually accelerates crop improvement programs. Figure 1 summarizes how AI technology can be applied to link high-throughput genomics and phenomics, which can result in better breeding strategies.
Figure 1. Artificial intelligence used as a powerful tool for the prediction of high-throughput crop phenotyping and gene functional analysis in modern crop breeding. The high-throughput phenotypic and genotypic data were collected from large crop germplasm and breeding populations. The massive comprehensive database could integrate various resources with AI technology, such as phenotypic diversity of crops, SNP polymorphisms, QTL analysis, GWAS analysis, genomic selection, and genome sequences. AI technologies are applied to predict the crop phenotype with whole-genome prediction, and novel breeding strategies are produced through AI based on computational and training models.
Research on crop genomics aims not only to understand the molecular mechanisms of phenotypes but also to use technical data and bioinformatics techniques to analyse those mechanisms [40][60]. AI is a promising approach for carrying out these tasks [41][61]. AI approaches provide a platform for analyzing huge, varied, and otherwise unusable datasets, such as those generated by genome sequencing and photo imaging, that are beyond conventional analytical strategies [5][42][15,62]. Recently, the AI approach has been explicitly employed in varied research fields of phenomics and genomics, such as: analysing genome assembly and genome-specific algorithms [16][26]; broad-range data analysis to address complex biological problems in metabolomics, proteomics, genomics, transcriptomics, and systems biology [42][43][62,63]; interpretation of gene expression cascades [44][45][64,65]; identification of significant SNPs in polyploid plants [46][66]; and high-throughput crop stress phenotyping [47][48][41,67]. Scientists have employed AI and its models to trace the flow of information from genomic DNA to genetically based phenotypes and to investigate potential variants in natural populations [28][49]. More specifically, for breeders, AI will assist the further investigation of genetic loci to improve agricultural output by applying genome algorithms and enabling high-throughput phenotyping of quantitative traits in open-field and controlled environments [28][49][49,68]. Additionally, AI can be combined with bioinformatics and genome sequencing analysis to interpret various molecular repertoires such as transcription factor binding sites [50][69], long non-coding RNAs (lncRNAs) [51][70], microRNAs (miRNAs), epigenetic modifications, coding genes, targeted polyadenylation sites [52][71], and cis-regulatory elements (CREs) [28][53][49,72].
Various crop databases now hold huge amounts of heterogeneous phenotypic and genotypic data (big data), providing breeders with potential resources for uncovering novel candidate genes for traits [54][73]. AI provides novel analytical and computational methods for the integrated analysis of such enormous datasets across the big-data spectrum [28][54][49,73]. In addition, employing AI to infer the interrelations between candidate genes and CREs is a novel approach for categorizing and identifying previously unknown genes for significant crop improvements [55][74]. Furthermore, AI strategies have further potential for interpreting crop yield, climatic variation, and high-throughput crop stress phenotyping, as well as the effects of temperature, ultraviolet (UV) radiation, wind, and hail [16][54][56][26,73,75]. The role of AI is becoming more and more important in obtaining, analyzing, integrating, and managing genomic and phenomic data to increase agricultural climate resilience [57][58][76,77]. Next generation sequencing (NGS)-based genotyping methods have helped to improve gene-mapping resolution and gene identification, and NGS-based genotyping for GWAS analysis has been used in crop improvement [49][68]. For example, in soybean, these kinds of studies have been widely used to identify genetic loci and candidate genes for seed weight [59][78], seed protein and oil contents [60][79], pod dehiscence [61][80], nitrogen fixation [62][81], plant height and primary branches [62][81], agronomic traits [63][82], disease resistance [64][83], and tocopherol concentration [65][84]. Bulk segregant analysis (BSA) and its modified methodologies are currently used in many crops [66][67][68][69][70][71][85,86,87,88,89,90].
NGS-based BSA is becoming a popular approach to identifying candidate genes for various traits, such as soybean mosaic virus resistance [72][91], charcoal rot resistance [73][92], flowering time [74][93], phytophthora resistance [75][94], and powdery-mildew resistance [76][95]. Recently, a deep-learning algorithm for BSA (DeepBSA) has been developed for QTL mapping and functional gene cloning in maize [77][96].
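As background for NGS-based BSA, the following sketch computes the classical SNP-index statistic (the fraction of reads carrying the alternative allele in a bulk) and the difference between two bulks, smoothed with a sliding window. This is the conventional statistic used in many BSA pipelines, shown only for orientation; it is not the DeepBSA algorithm, and the read counts, window size, and function names are hypothetical.

```python
# Illustrative SNP-index and delta SNP-index calculation for two bulks.
# Read counts and window size are hypothetical; this is not DeepBSA.

def snp_index(alt_reads, total_reads):
    """Fraction of reads supporting the alternative allele; None if no coverage."""
    if total_reads == 0:
        return None
    return alt_reads / total_reads


def delta_snp_index(high_bulk, low_bulk):
    """Per-SNP difference in SNP-index between the two bulks.

    Each bulk is a list of (alt_reads, total_reads) tuples in the same SNP order.
    """
    deltas = []
    for (h_alt, h_total), (l_alt, l_total) in zip(high_bulk, low_bulk):
        high = snp_index(h_alt, h_total)
        low = snp_index(l_alt, l_total)
        deltas.append(None if high is None or low is None else high - low)
    return deltas


def sliding_mean(values, window):
    """Average values over consecutive windows, skipping SNPs without coverage."""
    means = []
    for start in range(len(values) - window + 1):
        chunk = [v for v in values[start:start + window] if v is not None]
        means.append(sum(chunk) / len(chunk) if chunk else None)
    return means


# Toy read counts for four SNPs ordered along a chromosome.
HIGH_BULK = [(5, 10), (9, 10), (10, 10), (6, 10)]
LOW_BULK = [(5, 10), (1, 10), (0, 10), (5, 10)]

deltas = delta_snp_index(HIGH_BULK, LOW_BULK)
print(sliding_mean(deltas, window=2))
```

Windows where the smoothed difference departs strongly from zero are the regions where the two bulks differ in allele frequency, which is what marks a candidate QTL interval. Real analyses add depth filters and a significance threshold, which are omitted here.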