Deep Learning Methods in Plant Taxonomy: Comparison

Plant taxonomy is the scientific study of the classification and naming of plant species. It is a branch of biology that aims to categorize and organize the diversity of plant life on Earth. Traditionally, plant taxonomy has relied on morphological and anatomical characteristics, such as leaf shape, flower structure, and seed and fruit characters. Artificial intelligence (AI), machine learning, and especially deep learning can also play an instrumental role in plant taxonomy by automating the categorization of plant species based on the available features.

  • clustering
  • deep learning
  • plant taxonomy
  • AI

1. Introduction

Plant taxonomy has traditionally been performed using morphological and anatomical characteristics, such as leaf shape, flower structure, and seed and fruit characters. Additionally, with the advent of molecular biology and genetics, it is now possible to use DNA analysis to aid in the classification of plant species [1]. The morphological characteristics of the seed play an important role in the identification and classification of a plant. These characteristics are compiled into a data matrix and analyzed with a statistical package such as PRIMER, which helps scientists illustrate the relationships among taxa as a dendrogram.

Recently, clustering algorithms have been used to discover underlying patterns in image data and to form categories that can be used for image taxonomy. Image clustering is the process of grouping similar images together to simplify and organize large image datasets, and it has been studied extensively in machine learning and computer vision. Image clustering can be performed with algorithms such as k-means, hierarchical clustering, and spectral clustering; however, these methods have limitations in terms of accuracy and scalability. Deep learning has shown promising results in overcoming these limitations and has enabled more accurate and efficient clustering of large-scale image datasets, because deep learning algorithms can automatically learn features from images that can then be used for clustering. In this way, deep learning complements and enhances the traditional methods of plant taxonomy by providing a more efficient and objective approach to categorizing plant species, which can help preserve and protect biodiversity and improve our understanding of the relationships between different plant species. The following sections summarize recent advancements in image clustering using deep learning. Finally, datasets are an essential part of model building and evaluation; because a study sample is rarely representative of all wild plants and its results cannot be generalized directly to cultivated plants, models may need to be re-evaluated before being applied to different species.
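As a minimal illustration of the dendrogram-based workflow mentioned above, the Python sketch below applies agglomerative (hierarchical) clustering to a small, entirely hypothetical matrix of seed measurements; the feature values, sample labels, and number of groups are placeholders rather than data from any cited study.

```python
# Minimal sketch: hierarchical clustering of a seed-morphology feature matrix,
# analogous to the dendrograms produced by packages such as PRIMER.
# All feature values and labels below are hypothetical placeholders.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
import matplotlib.pyplot as plt

# Rows = seed samples, columns = morphological characters
# (e.g. length, width, weight), invented for illustration.
features = np.array([
    [5.1, 2.3, 0.12],
    [5.0, 2.4, 0.11],
    [8.7, 3.9, 0.35],
    [8.9, 4.1, 0.33],
    [12.2, 6.0, 0.80],
])
labels = ["sp_A1", "sp_A2", "sp_B1", "sp_B2", "sp_C1"]

# Agglomerative clustering with Ward linkage on Euclidean distances.
Z = linkage(features, method="ward")

# Cut the tree into 3 groups and print the assignment.
groups = fcluster(Z, t=3, criterion="maxclust")
print(dict(zip(labels, groups)))

# Plot the dendrogram that summarizes the relationships among samples.
dendrogram(Z, labels=labels)
plt.ylabel("Ward distance")
plt.tight_layout()
plt.show()
```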

2. Classification Using Deep Learning

CNNs are a class of neural networks that are particularly well suited to image and video recognition tasks [2]. In recent years, researchers have developed numerous variants of CNNs, such as residual networks (ResNets), Inception networks, and attention mechanisms, which have achieved state-of-the-art results in image classification, object detection, and semantic segmentation. Another technique used a deep learning approach to learn discriminative features from leaf images, combined with classifiers, for plant identification [3]. The authors in [4] introduced a new convolutional neural network architecture for classifying plant images. Transfer learning is another technique now widely used in image classification; three models are compared in [5], in which the AyurLeaf CNN model is assessed against AlexNet, Leaf, and a fine-tuned AlexNet, achieving an accuracy of 96.76%. That study found that transfer learning enhances the performance of deep learning models, particularly when deep features are combined with fine-tuning. The authors in [6] presented an extension of [7] based on an adaptive algorithm that relies on deep adaptive CNNs. In [8], D-Leaf, a CNN-based strategy, was presented; a pretrained AlexNet, a fine-tuned AlexNet, and D-Leaf were applied to three publicly available datasets, the MalayaKew, Flavia, and Swedish Leaf datasets, reaching accuracy rates of 90–98% and improving the classification of plant species. Similarly, the authors in [9] used Inception v3, ResNet50, and DenseNet201 and applied a variety of augmentation operations to further increase dataset diversity; their dataset contained 256,288 samples, plus a noisy set of 1,432,162 samples. An improved pretrained AlexNet is currently ranked fourth on this task [10]. Residual networks have been shown to be easier to optimize and to gain accuracy from substantially increased depth at relatively low complexity, as also evident in an ECG classification study based on deep transfer learning [11]; an ensemble of residual networks achieved a 3.57% error rate on the ImageNet test set. An Inception-based CNN was pretrained on ImageNet and fine-tuned on the PlantCLEF database, and the results of five CNNs, each tuned on randomly chosen database segments, were combined, although the optimization of the hyperparameters was not completed [12]. The authors in [13] presented a multi-input CNN for large-scale flower grading and achieved 89.6% accuracy using augmentation. The study in [14] proposed a method for reliably matching different views of an object or scene using distinctive invariant features extracted from images.
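The fine-tuning strategy that recurs in the studies above can be sketched as follows. This is a minimal example assuming a hypothetical folder-per-class plant image dataset at ./plant_images/train; it freezes an ImageNet-pretrained ResNet50 backbone and trains only a new classification head, which is one common variant of the transfer learning described here rather than the exact pipeline of any cited work.

```python
# Minimal transfer-learning sketch (hypothetical dataset path and settings):
# replace the head of an ImageNet-pretrained ResNet50 and train only that head.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),           # simple augmentation
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],  # ImageNet statistics
                         [0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder("./plant_images/train", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():          # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))  # new head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                # short illustrative training run
    for images, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```

In practice, unfreezing the last residual block with a smaller learning rate is a common next step when enough labeled plant images are available.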
An overview of techniques for identifying plant species and extracting features from leaf images was given in [16]. A related study [15] proposed a model that performed better on validation data than other well-known transfer learning approaches. In [17], the authors presented a data article describing a dataset of images of fruits and leaves from healthy citrus trees. Transfer learning is a machine learning technique in which knowledge learned from one task is applied to a related but different task [18]. Ibrahim et al. [19] proposed a deep learning approach to fruit identification and family classification based on a fruit image dataset; two different datasets were used, both individually and in augmented form, several deep learning models were investigated, and the proposed CNN model outperformed the others with a highest accuracy of 99.82%. In transfer learning, the knowledge gained from a pretrained model is used as a starting point for training a new model rather than starting from scratch [20][21][22]. This approach is advantageous when data for the task at hand are scarce or when the task closely resembles an existing one for which ample data and computational resources are already available.
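Another common form of transfer learning treats the pretrained network purely as a fixed feature extractor. The sketch below, again with a hypothetical dataset path, uses an ImageNet-pretrained MobileNetV2 to turn each image into a feature vector and fits a simple logistic-regression classifier on top; it is an illustrative sketch under these assumptions, not the method of any of the cited studies.

```python
# Minimal sketch of transfer learning by feature extraction: a frozen
# MobileNetV2 backbone produces feature vectors, and a shallow classical
# classifier is trained on them. The dataset path is hypothetical.
import numpy as np
import torch
from torchvision import datasets, models, transforms
from sklearn.linear_model import LogisticRegression

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
data = datasets.ImageFolder("./fruit_images", transform=transform)
loader = torch.utils.data.DataLoader(data, batch_size=32)

backbone = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
backbone.classifier = torch.nn.Identity()   # drop the ImageNet head
backbone.eval()

feats, labels = [], []
with torch.no_grad():
    for images, targets in loader:
        feats.append(backbone(images).numpy())   # 1280-dim feature vectors
        labels.append(targets.numpy())
X, y = np.concatenate(feats), np.concatenate(labels)

# Shallow classifier trained on the deep features.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy:", clf.score(X, y))
```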

3. Clustering Using Deep Learning

Clustering is a fundamental unsupervised machine learning technique that groups similar data points together. However, evaluating the quality of clustering results can be challenging owing to the absence of ground-truth labels, and several clustering evaluation methods have been proposed in the literature to assess the effectiveness of clustering algorithms. One of the most commonly used is the silhouette score [23], which measures the similarity of data points within a cluster and their dissimilarity from points in other clusters; a higher silhouette score indicates more effective clustering. The Rand index and adjusted mutual information (AMI) [24] are two evaluation methods that compare clustering results to a known ground-truth partition: the Rand index measures the similarity between the clustering results and the ground truth, while AMI additionally adjusts for chance agreement. A higher Rand index or AMI indicates better agreement with the ground truth. Each method has its strengths and weaknesses, so using several of them is often necessary to gain a comprehensive picture of the clustering results. Several deep learning approaches have been proposed for image detection, classification, and clustering, including CNNs, autoencoders, and generative adversarial networks (GANs) [25][26]. CNNs have been widely used for image clustering owing to their ability to automatically learn hierarchical features from images. Deep embedded clustering (DEC) is a popular CNN-based clustering algorithm that uses a two-stage process of unsupervised pretraining followed by clustering. Other CNN-based clustering algorithms include convolutional autoencoder clustering (CAE-C) and deep convolutional autoencoder clustering (DCAE-C) [27]. Autoencoders are neural networks that learn a compressed representation of the input data; they have been used for image clustering by training an autoencoder to reconstruct input images and then using the learned encoder to generate feature vectors for clustering. Clustering using deep autoencoders (CDAs) is an example of an autoencoder-based clustering algorithm [28]. Based on the literature review, the following can be concluded (a minimal clustering and evaluation sketch follows the list):
  • Deep learning is among the most promising candidates for image feature extraction.
  • In botanical studies, there is a pressing need to investigate transfer learning algorithms for seed taxonomy.
  • Hierarchical clustering algorithms can be investigated for the automated clustering of seeds, which can lay the groundwork for future classification applications.
  • The following are the most commonly used deep learning models for images: DenseNet121, DenseNet201, ResNet50V2, EfficientNetB6, EfficientNetB1, EfficientNetB0, MobileNetV2, EfficientNetB3, VGG16, VGG19, EfficientNetB5, EfficientNetB7, EfficientNetB2, and EfficientNetB4.
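As referenced above, the sketch below illustrates the clustering step and the evaluation measures discussed in this section: k-means is applied to feature vectors (here stand-in synthetic features rather than real CNN or autoencoder embeddings), and the result is scored with the silhouette coefficient and, where ground-truth species labels exist, with the adjusted Rand index and adjusted mutual information.

```python
# Minimal sketch of clustering deep feature vectors and evaluating the result.
# X and y are synthetic stand-ins for a feature matrix and species labels;
# in practice they would come from a feature-extraction step like the one
# sketched in Section 2.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (silhouette_score,
                             adjusted_rand_score,
                             adjusted_mutual_info_score)

rng = np.random.default_rng(0)
# Three hypothetical "species", 50 samples each, 128-dim features.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 128)) for c in (0.0, 2.0, 4.0)])
y = np.repeat([0, 1, 2], 50)

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print("silhouette:", silhouette_score(X, clusters))                  # internal measure, no labels needed
print("adjusted Rand index:", adjusted_rand_score(y, clusters))      # agreement with ground truth
print("adjusted mutual information:", adjusted_mutual_info_score(y, clusters))
```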

References

  1. Takamitsu, Y.; Orita, Y. Effect of glomerular change on the electrolyte reabsorption of the renal tubule in glomerulonephritis (author’s transl). Jpn. J. Nephrol. 1978, 20, 1221–1227.
  2. Lee, J.W.; Yoon, Y.C. Fine-Grained Plant Identification Using Wide and Deep Learning Model. In Proceedings of the 2019 International Conference on Platform Technology and Service (PlatCon), Jeju, Republic of Korea, 28–30 January 2019; pp. 1–5.
  3. Lee, S.H.; Chan, C.S.; Wilkin, P.; Remagnino, P. Deep-plant: Plant identification with convolutional neural networks. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; pp. 452–456.
  4. Gyires-Tóth, B.P.; Osváth, M.; Papp, D.; Szucs, G. Deep learning for plant classification and content-based image retrieval. Cybern. Inf. Technol. 2019, 19, 88–100.
  5. Dileep, M.R.; Pournami, P.N. AyurLeaf: A Deep Learning Approach for Classification of Medicinal Plants. In Proceedings of the TENCON 2019—2019 IEEE Region 10 Conference (TENCON), Kochi, India, 17–20 October 2019; pp. 321–325.
  6. Picon, A.; Alvarez-Gila, A.; Seitz, M.; Ortiz-Barredo, A.; Echazarra, J.; Johannes, A. Deep convolutional neural networks for mobile capture device-based crop disease classification in the wild. Comput. Electron. Agric. 2019, 161, 280–290.
  7. Johannes, A.; Picon, A.; Alvarez-Gila, A.; Echazarra, J.; Rodriguez-Vaamonde, S.; Díez Navajas, A.; Ortiz-Barredo, A. Automatic plant disease diagnosis using mobile capture devices, applied on a wheat use case. Comput. Electron. Agric. 2017, 138, 200–209.
  8. Tan, J.W.; Chang, S.W.; Abdul-Kareem, S.; Yap, H.J.; Yong, K.T. Deep Learning for Plant Species Classification Using Leaf Vein Morphometric. IEEE/ACM Trans. Comput. Biol. Bioinform. 2020, 17, 82–90.
  9. Haupt, J.; Kahl, S.; Kowerko, D.; Eibl, M. Large-scale plant classification using deep convolutional neural networks. CEUR Workshop Proc. 2018, 2125, 1–7.
  10. Reyes, A.K.; Caicedo, J.C.; Camargo, J.E. Fine-tuning Deep Convolutional Networks for Plant Recognition. CLEF 2015, 1391, 467–475.
  11. Gajendran, M.K.; Khan, M.Z.; Khattak, M.A.K. ECG Classification using Deep Transfer Learning. In Proceedings of the 2021 4th International Conference on Information and Computer Technologies (ICICT), Kahului, HI, USA, 11–14 March 2021; pp. 1–5.
  12. Choi, S. Plant identification with deep convolutional neural network: SNUMedinfo at LifeCLEF plant identification task 2015. CEUR Workshop Proc. 2015, 1391, 2–5.
  13. Sun, Y.; Zhu, L.; Wang, G.; Zhao, F. Multi-Input Convolutional Neural Network for Flower Grading. J. Electr. Comput. Eng. 2017, 2017, 9240407.
  14. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. Available online: https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf (accessed on 6 January 2023).
  15. Xiao, J.; Wang, J.; Cao, S.; Li, B. Application of a Novel and Improved VGG-19 Network in the Detection of Workers Wearing Masks. J. Phys. Conf. Ser. 2020, 1518, 012041.
  16. Zhang, S.; Huang, W.; Huang, Y.; Zhang, C. Plant species recognition methods using leaf image: Overview. Neurocomputing 2020, 408, 246–272.
  17. Ullah, M.I.; Attique, M.; Sharif, M.; Ahmad, S.; Bukhari, C. A citrus fruits and leaves dataset for detection and classification of citrus diseases through machine learning. Data Brief 2019, 26, 104340.
  18. Wang, X.; Yang, Y.; Guo, Z.; Zhou, Z.; Liu, Y.; Pang, Q.; Du, S. Real-World Image Super Resolution via Unsupervised Bi-directional Cycle Domain Transfer Learning based Generative Adversarial Network. arXiv 2022, arXiv:2211.10563.
  19. Ibrahim, N.M.; Gabr, D.G.I.; Rahman, A.-U.; Dash, S.; Nayyar, A. A deep learning approach to intelligent fruit identification and family classification. Multimed. Tools Appl. 2022, 81, 27783–27798.
  20. Khan, T.A.; Fatima, A.; Shahzad, T.; Rahman, A.U.; Alissa, K.; Ghazal, T.M.; Al-Sakhnini, M.M.; Abbas, S.; Khan, M.A.; Ahmed, A. Secure IoMT for Disease Prediction Empowered with Transfer Learning in Healthcare 5.0, the Concept and Case Study. IEEE Access 2023, 11, 39418–39430.
  21. Asif, R.N.; Abbas, S.; Khan, M.A.; Rahman, A.U.; Sultan, K.; Mahmud, M.; Mosavi, A. Development and Validation of Embedded Device for Electrocardiogram Arrhythmia Empowered with Transfer Learning. Comput. Intell. Neurosci. 2022, 2022, 5054641.
  22. Nasir, M.U.; Zubair, M.; Ghazal, T.M.; Khan, M.F.; Ahmad, M.; Rahman, A.-u.; Hamadi, H.A.; Khan, M.A.; Mansoor, W. Kidney Cancer Prediction Empowered with Blockchain Security Using Transfer Learning. Sensors 2022, 22, 7483.
  23. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65.
  24. Vinh, N.X.; Epps, J.; Bailey, J. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 2010, 11, 2837–2854.
  25. Xie, J.; Girshick, R.; Farhadi, A. Unsupervised deep embedding for clustering analysis. In Proceedings of the 33rd International Conference on Machine Learning (ICML), New York, NY, USA, 19–24 June 2016; Volume 1, pp. 740–749.
  26. Ahmed, M.I.B.; Zaghdoud, R.; Ahmed, M.S.; Sendi, R.; Alsharif, S.; Alabdulkarim, J.; Saad, B.A.A.; Alsabt, R.; Rahman, A.; Krishnasamy, G. A Real-Time Computer Vision Based Approach to Detection and Classification of Traffic Incidents. Big Data Cogn. Comput. 2023, 7, 22.
  27. Guo, X.; Liu, X.; Zhu, E.; Yin, J. Deep clustering with convolutional autoencoders. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 582–590.
  28. Yang, J.; Parikh, D.; Batra, D. Joint unsupervised learning of deep representations and image clusters. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5147–5156.