The impact of wood identification extends beyond illegal trading and ecological issues. Wood identification is paramount for the timber industry, civil and structural engineering, criminology, archaeology, art history, ethnography, conservation and restoration, and many other disciplines.
1. Introduction
Despite the many wood identification methods now available, their varied results, costs, accessibility, deployment times and limiting factors hinder their applicability to real-world identification. Herein, an overview is presented of the changes that have occurred in wood identification methods, together with a review of computer vision-based wood identification, which is currently one of the fastest-developing research areas in artificial intelligence (AI), with very promising results and high identification accuracy. In this technique, visual data from any given image are processed to extract the relevant features needed to make a decision.
The digital systems described above are the foundation of the systems currently used to identify wood, which, despite their limitations, are mostly based on computer vision technology. This technology is applicable to several fields of research and industry, including neurobiology, autonomous vehicles and facial recognition. Computer vision systems process visual data from any given image or video to extract the relevant features required to make a decision [1].
This image recognition ability, also known as image classification, is one of the most important research areas in AI and is most frequently based on supervised learning. In supervised learning, the network builds a model that learns classification rules from labelled images and then classifies the input data according to those same rules (generally used for image classification). In unsupervised learning, the model instead obtains unknown information from unlabelled data (generally used for image clustering) [2].
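The difference between the two learning modes can be sketched with a toy example. The code below is only an illustration under stated assumptions: each image is assumed to have already been reduced to a single numeric feature, the feature values and class names are invented, and the nearest-centroid classifier and two-cluster k-means are stand-ins chosen for brevity, not the methods used in the cited studies.

```python
from statistics import mean

def fit_centroids(features, labels):
    """Supervised step: learn one mean feature value per labelled class."""
    return {c: mean(f for f, l in zip(features, labels) if l == c)
            for c in sorted(set(labels))}

def classify(value, centroids):
    """Apply the learned rule: pick the class whose centroid is nearest."""
    return min(centroids, key=lambda c: abs(value - centroids[c]))

def two_means(features, iterations=10):
    """Unsupervised step: split unlabelled values into two clusters (1-D k-means, k=2)."""
    low, high = min(features), max(features)
    groups = [list(features), []]
    for _ in range(iterations):
        groups = [[], []]
        for f in features:
            groups[0 if abs(f - low) <= abs(f - high) else 1].append(f)
        if not groups[1]:  # all values identical: nothing to separate
            break
        low, high = mean(groups[0]), mean(groups[1])
    return groups

# Hypothetical feature values (invented for illustration only).
features = [0.21, 0.25, 0.19, 0.80, 0.75, 0.85]
labels = ["species_a"] * 3 + ["species_b"] * 3

centroids = fit_centroids(features, labels)  # uses the labels
print(classify(0.30, centroids))             # label predicted for a new sample
print(two_means(features))                   # groups found without any labels
```

The supervised path needs the `labels` list and returns a class name, whereas the unsupervised path sees only the feature values and returns anonymous groups that still have to be interpreted.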
Using predesigned algorithms, machine learning can also decide what to do with the data recognised by computer vision (the input data) without human assistance [3] [4]. This removes the need to teach the model the necessary features or procedures for wood identification [5].
Computer vision technology is very appealing to many researchers because of its verifiable potential for field application [6] and its proven ability to recognise and quantify wood structure variations that are not easily discernible using strictly “human” analysis. It is also an affordable resource [7] and, therefore, scalable. However, for the software to interpret the specific structural architecture of the samples analysed with such a high level of precision, reference material must be continually added to the image database so that the software can recognise natural variations in wood structure [8].
Computer vision-based wood identification is a real-world application that combines two types of software with different approaches within AI: machine learning and deep learning [9] [10].
2. Machine Learning
Machine learning operates primarily as software that recognises patterns in input images, which are processed to define a descriptive structure against which the unknown image is then compared [11]. This involves various stages, as follows.
3. Image Acquisition
The most frequently used types of image are macroscopic images (obtained without magnification using a normal digital camera) [12] [13], stereograms (stereoscopic images obtained with hand lens magnification, ca. 10×) [14] [15], micrographs (optical microscopic images) [16] [17], SEM images (up to 10,000×) [18], and X-ray computed tomography (CT) images [19] [20].
Light control and uniformity are significant issues in image processing [21]. They are addressed with techniques that filter and normalise image brightness [22] [23].
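One simple form of brightness normalisation is a linear contrast stretch, sketched below. This is a minimal illustration and not the specific technique used in the cited studies: the image is assumed to be a greyscale picture stored as a list of rows of integer grey levels, and the function name `stretch_contrast` is hypothetical.

```python
def stretch_contrast(image, low=0, high=255):
    """Linearly rescale grey levels so the darkest pixel maps to `low`
    and the brightest pixel maps to `high`."""
    flat = [p for row in image for p in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:  # uniform image: nothing to stretch
        return [row[:] for row in image]
    span = brightest - darkest
    return [[round(low + (p - darkest) * (high - low) / span) for p in row]
            for row in image]

# Two hypothetical captures of the same surface under dim and bright light.
dim = [[50, 100], [150, 200]]
bright = [[80, 130], [180, 230]]

# After stretching, both captures occupy the same grey-level range.
print(stretch_contrast(dim))
print(stretch_contrast(bright))
```

Because both inputs differ only by a constant brightness offset, the stretch maps them to the same output, which is the kind of uniformity the normalisation step aims for.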
4. Image Datasets
Image dataset construction or availability is one of the most significant factors among the multiple issues that can affect the performance of computer vision-based wood identification systems.
The more extensive the dataset, the more of the naturally occurring biological variation within a species the model can access and learn. However, because constructing a dataset of wood samples is such a difficult and time-consuming task, most studies use wood collections as reference material [24] [25].
This limitation is countered to some extent by initiatives such as ImageNet [26]. Aiming to advance computer vision and deep learning research, the ImageNet dataset was made freely available to researchers worldwide. It contains 14.2 million images across more than 20,000 classes. A similar process is underway with the digitisation of herbaria [27] [28]. However, despite the efforts made [29] [21], the lack of free access to worldwide wood image datasets continues to be the main constraint for computer vision-based wood identification [30].
Table 1 shows the main currently available datasets that have useful data for computer vision-based wood identification research.
Table 1. Wood image datasets available for computer vision-based wood identification research, adapted from [31].
4.1. Image Processing
Machine learning comprises two independent procedures: feature processing, also known as feature extraction (obtaining relevant features from input images), and classification (learning the extracted features and classifying query images). Image processing is, however, preceded by an earlier step: pre-processing.
Pre-processing aims to convert the image into data that a specific algorithm can use to extract the required features, thus reducing computational complexity and facilitating subsequent processing [41]. The techniques used for this include greyscale conversion and image cropping [42] [43], filtering [22] [44], image sharpening [25] [45] and denoising [46] [47].
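Three of the pre-processing steps named above (greyscale conversion, cropping and denoising) can be sketched in a few lines. The code is illustrative only and rests on assumptions not found in the source: images are lists of rows, colour pixels are (R, G, B) tuples, the greyscale weights are the widely used ITU-R BT.601 luma coefficients, and a 3×3 median filter stands in for whichever denoising method a given study applies. All function names are hypothetical.

```python
from statistics import median

def to_greyscale(rgb_image):
    """Convert rows of (R, G, B) tuples to grey levels (BT.601 luma weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]

def centre_crop(image, size):
    """Cut a size x size window from the centre of a 2-D image."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]

def median_denoise(image):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.
    Border pixels are copied unchanged."""
    out = [row[:] for row in image]
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

# A hypothetical 5x5 colour image: uniform grey with one bright noise pixel.
rgb = [[(90, 90, 90)] * 5 for _ in range(5)]
rgb[2][2] = (255, 255, 255)

grey = to_greyscale(rgb)
clean = median_denoise(grey)   # the isolated bright pixel is suppressed
patch = centre_crop(clean, 3)  # 3x3 region passed on to feature extraction
print(patch)
```

Each step reduces the data the later stages must handle: three colour channels become one, the crop discards the image margins, and the filter removes isolated outliers before features are extracted.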
5. Deep Learning
Deep learning is among the most notable and promising of the many branches of machine learning research.
As a neural network that attempts to simulate the function, structure and behaviour of the human brain (Figure 1), it has the capacity to process and “learn” large amounts of data [48] [49].
Figure 1. General pipeline of deep learning models for image classification (based on [50] [51]).
5.1. Artificial Neural Networks (ANN)
Artificial neural networks are not only one of the main investigation methods, but also constitute the foundation of deep learning [31]. These mathematical structures, inspired by biological neural networks, are a form of supervised or unsupervised learning that shows a high ability to learn from the examples given to them and to extrapolate that information to future unidentified samples. This ability to reproduce, model and “learn” nonlinear processes has given ANNs widespread applications in multiple disciplines [43] [52].
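Why layered neurons can model nonlinear processes can be shown with a very small network. The sketch below is an assumption-laden illustration, not a trained model: the weights and biases are set by hand instead of being learned from examples, a step function is used as the activation, and the exclusive-or (XOR) function is chosen only because no single neuron of this kind can represent it, whereas two layers can.

```python
def step(x):
    """Threshold activation: the neuron fires (1) only for positive input."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def xor_network(x1, x2):
    """Two hidden neurons feeding one output neuron (hand-set weights)."""
    h_or = neuron((x1, x2), (1, 1), -0.5)    # fires if at least one input is 1
    h_and = neuron((x1, x2), (1, 1), -1.5)   # fires only if both inputs are 1
    return neuron((h_or, h_and), (1, -1), -0.5)  # "or" but not "and"

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

In a real ANN the same structure is kept, but the weights and biases are adjusted automatically from training examples instead of being written by hand.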
5.2. Convolutional Neural Networks (CNN)
Convolutional neural networks are one of the most significant applications of ANNs. In the AI context, a CNN is a class of feedforward ANN that has been successfully applied to digital image processing analysis.
A CNN processes images more effectively by applying filtering techniques to ANNs [53]. This is a powerful and accurate way of solving classification problems, and CNNs are mainly credited for their role in image analysis, recognition, and classification. The architecture of a CNN typically has multiple layers between input and output, of three types: convolutional layers, pooling layers and fully connected layers. Each type of layer performs a different task as the image passes through the network. As the images progress through the distinct layers, features such as edges, colours and shapes are extracted and interpreted. These features are then learned and classified by the deep neural network, resulting ultimately in the network’s ability to identify a specific object [54] [55]. Another advantage is the capacity to recognise important features automatically, without human supervision.
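The role of each layer type can be illustrated with a single forward pass through one convolutional layer, one pooling layer and one fully connected layer. This is a minimal sketch under stated assumptions: the kernel and the fully connected weights are fixed by hand, whereas a real CNN learns them during training; the ReLU activation and max pooling are common choices that the source does not specify; and all names are hypothetical.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation,
    which deep learning libraries conventionally call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(len(image[0]) - kw + 1)]
            for y in range(len(image) - kh + 1)]

def relu(fmap):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Downsample by keeping the largest value in each size x size block."""
    return [[max(fmap[y + i][x + j] for i in range(size) for j in range(size))
             for x in range(0, len(fmap[0]) - size + 1, size)]
            for y in range(0, len(fmap) - size + 1, size)]

def fully_connected(fmap, weights, biases):
    """Flatten the feature map and compute one score per output class."""
    flat = [v for row in fmap for v in row]
    return [sum(v * w for v, w in zip(flat, ws)) + b
            for ws, b in zip(weights, biases)]

# Hypothetical 5x5 image with a dark-to-light vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-light transitions

features = max_pool(relu(convolve2d(image, edge_kernel)))
scores = fully_connected(features, [[1, 0, 1, 0], [0, 1, 0, 1]], [0, 0])
print(features, scores)
```

The convolution extracts the edge, the pooling layer keeps the strongest response in each region while shrinking the map, and the fully connected layer turns the remaining values into class scores.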
5.3. Generative Adversarial Networks (GANs)
Within deep learning, GANs [56] are described as neural networks that can learn to generate realistic samples from the data on which they were trained.
They use a neural network as a generator that takes a random distribution of data as input and learns to map it to the desired output distribution. A second neural network, known as a discriminator (a binary classifier), receives both training images and generated images and estimates the probability that a given image originated from the training data (real) or from the generator (fake), thus assessing the most likely class to which the image belongs [57].
Generative adversarial networks can produce highly realistic images using CNNs in an unsupervised manner [58].
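The opposing goals of the generator and the discriminator can be expressed as two loss functions. The sketch below only evaluates those losses for given discriminator outputs; it does not build or train either network. Binary cross-entropy is the standard loss for a binary classifier, and the generator loss shown is the commonly used non-saturating variant, which is an assumption here since the source does not state which formulation the cited works use. The probability values are invented.

```python
import math
from statistics import mean

def bce(prob, target):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    eps = 1e-12  # avoids log(0)
    prob = min(max(prob, eps), 1 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))

def discriminator_loss(d_real, d_fake):
    """Low when training images score near 1 and generated images near 0."""
    return mean(bce(p, 1) for p in d_real) + mean(bce(p, 0) for p in d_fake)

def generator_loss(d_fake):
    """Low when the discriminator scores generated images near 1 (real)."""
    return mean(bce(p, 1) for p in d_fake)

# Hypothetical discriminator outputs: probability that each image is real.
d_real = [0.9, 0.8]  # scores for training images
d_fake = [0.2, 0.1]  # scores for generated images

print(discriminator_loss(d_real, d_fake))  # small: discriminator is winning
print(generator_loss(d_fake))              # large: generator is being caught
```

Training alternates between lowering the first loss by updating the discriminator and lowering the second by updating the generator, so an improvement for one network raises the loss of the other.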