Tree Detection and Crown Delineation using UAV-SfM Data
Subjects: Remote Sensing

Accurate detection and delineation of individual trees and their crowns in dense forest environments are essential for forest management and ecological applications. This research explores the potential of combining leaf-off and leaf-on structure from motion (SfM) data products from unoccupied aerial vehicles (UAVs) equipped with RGB cameras. 

  • unoccupied aerial vehicle (UAV)
  • structure from motion (SfM)
  • leaf-off
  • leaf-on
  • deciduous forest
  • individual tree crown delineation (ITCD)
  • tree detection

1. UAV Imagery Processing Using Structure from Motion

One of the great advantages of UAV data is that 3D information, such as 3D point clouds, can be generated from 2D drone imagery by applying photogrammetric processing steps, commonly known as structure from motion (SfM) [16]. During data acquisition, highly overlapping images are captured, providing different perspectives on the same ground spots. Prominent feature points are extracted from each image, and features corresponding to the same 3D point are matched in the overlapping regions of different images. The matched features are linked into tracks, and aerial triangulation, such as bundle adjustment, is then applied to estimate camera positions and orientations and to recover the 3D geometry. Based on these estimated camera positions and orientations, densification algorithms (dense stereo matching) can be used to generate dense 3D point clouds. From the point cloud, a digital surface model (DSM) can be derived and, by projecting the single images onto the DSM, an orthomosaic can be generated [16,17]. These data products can be further used to extract specific forest parameters. The UAV-SfM approach has the potential to provide both geometric and spectral datasets, serving as input data for various forest parameter extraction algorithms. While methods integrating geometrical point and spectral image data are increasingly used in the field of forestry, most studies rely on LiDAR data rather than on UAV-SfM point clouds [18,19].
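As a rough illustration of the step from point cloud to DSM, the sketch below grids a point cloud and keeps the highest elevation per cell. This is only one simple way to derive a DSM and is not taken from the cited studies; the function name `rasterize_dsm`, the cell size, and the sample coordinates are assumptions made for the example.

```python
import numpy as np

def rasterize_dsm(points, cell_size):
    """Grid an (N, 3) array of x, y, z points into a DSM (highest z per cell)."""
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # Column index grows with x; row index grows as y decreases (north-up raster).
    cols = np.floor((x - x.min()) / cell_size).astype(int)
    rows = np.floor((y.max() - y) / cell_size).astype(int)
    dsm = np.full((rows.max() + 1, cols.max() + 1), -np.inf)
    # Unbuffered maximum: points falling into the same cell keep the largest z.
    np.maximum.at(dsm, (rows, cols), z)
    dsm[np.isinf(dsm)] = np.nan  # cells that received no point
    return dsm

# Hypothetical points (x, y, z) in metres.
points = np.array([
    [0.2, 0.2, 10.0],
    [0.7, 0.3, 12.5],  # same cell as the first point, but higher
    [1.5, 0.4, 8.0],
    [0.1, 1.6, 15.0],
])
dsm = rasterize_dsm(points, cell_size=1.0)
print(dsm)
```

In this toy case the two points sharing a cell collapse to the higher value, and the one cell without points stays empty (NaN). Real SfM workflows usually add interpolation and outlier filtering, which are omitted here.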

2. UAV-Data-Based Products for Tree Crown Delineation

In order to analyze tree parameters on an individual tree level within a forest stand, it is necessary to first segment single trees. This involves delineating each projected tree crown, which can be identified in the orthomosaic or height models, as a separate object. This individual tree crown delineation (ITCD) serves as the foundation for various subsequent analysis steps, including tree species classification, environmental and forest monitoring at the tree level, and the extraction of individual tree parameters directly from remote sensing data [20,21,22,23]. Over the past few decades, numerous ITCD methods have been developed, based on generalized characteristics of trees or forests. Most of these methods can be applied to 2D rasters such as orthomosaics, 2.5D rasters such as canopy height models (CHMs), or 3D data products such as point clouds. However, certain methods specifically rely on 3D data and cannot be used when only 2D data are available. These methods often originate from the field of laser scanning and are increasingly being tested for point clouds derived from SfM as well [24].
Most methods assume a similar tree shape, hemispherical to conical, with one tree top located at the center of the crown. Due to its exposed position, the top of the tree receives the highest solar radiation, resulting in the highest intensity and brightness values [20,22]. Algorithms are expected to yield better results for forests with a sparse canopy, lower species diversity, and similar age structure. These characteristics are more commonly found in managed forests and in coniferous, savanna, or tundra forest systems [22].
The analysis is often divided into two parts: tree detection, which involves identifying the position of a tree trunk, and crown delineation, which involves outlining the entire tree crown [20]. In the following, some of the more commonly used methods for tree detection and delineation are presented.
Local maxima and region growing: This method builds upon the previously mentioned canopy characteristics. Initially, individual trees are detected by identifying local maxima, which ideally represent tree midpoints. These maxima can be based on both CHM values and brightness values [20,22]. Difficulties may arise when defining the search radius for local maxima, which is derived from pixel size and average crown diameter, and because, in reality, tree crowns are not symmetrically aligned around a central point. Smoothing filters can reduce unwanted noise within the maxima [23,25,26,27]. Wulder et al. [28] propose using varying local maxima search radii, each based on the semivariance of the pixel.
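A minimal sketch of local-maxima detection on a CHM is given below, assuming a fixed square search window. The function name `detect_tree_tops`, the `min_height` cut-off, and the sample heights are illustrative assumptions; in practice the radius in pixels would follow from pixel size and average crown diameter, for instance (crown diameter / 2) / pixel size.

```python
import numpy as np

def detect_tree_tops(chm, radius, min_height):
    """Return (row, col) of pixels that are the maximum of their square window."""
    chm = np.asarray(chm, dtype=float)
    rows, cols = chm.shape
    # Pad with -inf so that border pixels can still be compared with a full window.
    padded = np.pad(chm, radius, mode="constant", constant_values=-np.inf)
    window_max = np.full(chm.shape, -np.inf)
    for dr in range(2 * radius + 1):
        for dc in range(2 * radius + 1):
            window_max = np.maximum(window_max, padded[dr:dr + rows, dc:dc + cols])
    # A pixel is a candidate top if it equals its window maximum and is tall enough.
    # Note: flat plateaus yield several candidates; smoothing beforehand reduces this.
    is_top = (chm == window_max) & (chm >= min_height)
    return [(int(r), int(c)) for r, c in np.argwhere(is_top)]

# Hypothetical CHM in metres with two crowns peaking at 9 m and 8 m.
chm = np.array([
    [2, 3, 2, 1, 1],
    [3, 9, 3, 2, 2],
    [2, 3, 2, 4, 3],
    [1, 2, 4, 8, 4],
    [1, 1, 3, 4, 3],
])
tops = detect_tree_tops(chm, radius=1, min_height=5)
print(tops)  # [(1, 1), (3, 3)]
```

A radius that is too small tends to split one crown into several tops, while one that is too large merges neighbouring trees, which is the difficulty described above.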
Starting from these initial seed points, neighboring pixels or objects that exhibit similarity are added to the crown objects until a termination criterion is met, indicating that the crown has been delineated. This process is known as region growing [20].
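Region growing can be sketched as a breadth-first expansion from the seed points. The version below is one possible variant, not the procedure of any cited study: the similarity rule (a pixel joins a crown while its height stays above a fraction `rel_height` of the seed height), the function name `grow_crowns`, and the sample values are assumptions for illustration.

```python
from collections import deque
import numpy as np

def grow_crowns(chm, seeds, rel_height=0.6):
    """Grow one labelled region per seed over 4-connected neighbours."""
    chm = np.asarray(chm, dtype=float)
    labels = np.zeros(chm.shape, dtype=int)  # 0 means "not assigned to a crown"
    queue = deque()
    for label, (r, c) in enumerate(seeds, start=1):
        labels[r, c] = label
        queue.append((r, c, chm[r, c]))
    while queue:
        r, c, seed_height = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < chm.shape[0] and 0 <= nc < chm.shape[1]
            if not inside or labels[nr, nc] != 0:
                continue
            # Termination criterion: stop where the canopy drops below
            # rel_height times the height of the crown's seed.
            if chm[nr, nc] >= rel_height * seed_height:
                labels[nr, nc] = labels[r, c]
                queue.append((nr, nc, seed_height))
    return labels

# Hypothetical height transect (metres) with seeds on the 9 m and 8 m peaks.
chm = np.array([[9, 7, 6, 2, 5, 8, 6]])
labels = grow_crowns(chm, seeds=[(0, 0), (0, 5)], rel_height=0.6)
print(labels)  # [[1 1 1 0 2 2 2]]
```

The 2 m pixel between the two crowns fails the criterion for both seeds and stays unassigned, so the crowns remain separated.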
Valley-following approach: The valley-following approach, initially introduced by Gougeon [29], consists of two parts. The first part involves classifying the areas between the individual tree canopies, while the second part utilizes a rule-based method to refine these classified areas [29]. These intermediate areas are referred to as valleys. They are characterized by stronger shading, which results in lower intensity and brightness values than in the surrounding areas, or they correspond to local minima [25,30]. According to Ke and Quackenbush [20] and Workie [23], relying solely on this shade-following approach often leads to incomplete separation of individual trees because, depending on tree density, not all crowns are adequately separated from each other by “valleys”.
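The first part, finding candidate valley pixels, can be approximated very roughly as follows. This is a simplified stand-in and not Gougeon's rule set: a pixel is flagged when it is darker than both opposite neighbours along at least one of four directions. The function name `valley_mask` and the sample brightness values are assumptions.

```python
import numpy as np

def valley_mask(brightness):
    """Flag pixels darker than both opposite neighbours in at least one direction."""
    img = np.asarray(brightness, dtype=float)
    rows, cols = img.shape
    # Pad with -inf so that no pixel counts as darker than a missing neighbour.
    padded = np.pad(img, 1, mode="constant", constant_values=-np.inf)

    def shifted(dr, dc):
        # Value of the neighbour at offset (dr, dc) for every pixel.
        return padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]

    mask = np.zeros(img.shape, dtype=bool)
    # Horizontal, vertical, and the two diagonal directions.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        mask |= (img < shifted(dr, dc)) & (img < shifted(-dr, -dc))
    return mask

# Hypothetical brightness values: a shaded gap (column 2) between two bright crowns.
brightness = np.array([
    [9, 8, 2, 8, 9],
    [9, 7, 1, 7, 9],
    [8, 7, 2, 7, 8],
])
mask = valley_mask(brightness)
print(mask.astype(int))
```

Here only the dark middle column is flagged. Where crowns touch without a darker gap, no valley pixels appear, which corresponds to the incomplete separation noted above.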
Watershed segmentation: The watershed segmentation, first described by Beucher and Lantuéjoul [31], is a form of image processing segmentation. It also draws upon the topographic analogy of the canopy described earlier. In this method, the values of a gray-scale layer are inverted, so that tree tops become local minima, and the inverted surface is “flooded” starting from these minima. The resulting individual watersheds are separated from each other by dams, creating distinct segments [21,22,32]. Markers can be used in this process, representing the local minima from which the “flooding” originates, ideally representing tree tops. The resulting segments are supposed to delimit the individual tree crowns [23]. According to Derivaux et al. [32], a common challenge in this method is over-segmentation, which can be mitigated through targeted marker placement, application of smoothing filters, or combined use with region-merging techniques and other methods.
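A marker-based flooding can be sketched with a priority queue, as below. This is a simplified variant that assigns every pixel to a marker and does not construct explicit dam pixels; the function name `marker_watershed` and the sample CHM are assumptions for illustration, and library implementations are generally preferable in practice.

```python
import heapq
import numpy as np

def marker_watershed(chm, markers):
    """Flood the inverted CHM from labelled markers (no explicit dam pixels)."""
    surface = -np.asarray(chm, dtype=float)  # invert: tree tops become minima
    rows, cols = surface.shape
    labels = np.zeros(surface.shape, dtype=int)
    heap = []
    order = 0  # insertion counter used as a tie-breaker for equal levels
    for label, (r, c) in enumerate(markers, start=1):
        labels[r, c] = label
        heapq.heappush(heap, (float(surface[r, c]), order, r, c))
        order += 1
    while heap:
        # Always continue flooding from the lowest pixel reached so far.
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr, nc] == 0:
                labels[nr, nc] = labels[r, c]
                heapq.heappush(heap, (float(surface[nr, nc]), order, nr, nc))
                order += 1
    return labels

# Hypothetical CHM (metres): two crowns separated by a low gap in column 2.
chm = np.array([
    [3, 4, 2, 4, 3],
    [4, 9, 2, 8, 4],
    [3, 4, 2, 4, 3],
])
labels = marker_watershed(chm, markers=[(1, 1), (1, 3)])
print(labels)
```

Supplying one marker per expected tree top is what limits over-segmentation in this sketch: the number of segments can never exceed the number of markers.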
Template matching: Template matching is a method that can be employed to detect individual tree crowns when the crowns exhibit similar shapes and comparable spectral values. Templates are created from a gray-value layer and represent typical, mostly averaged, patterns of tree shape and pixel values. Both radiometric and geometric properties of the crown are utilized, and different viewing angles can be considered. The template is compared to all possible tree points, with correlation values above a defined threshold representing individual trees [21,22,25,33].
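The comparison step can be illustrated with a normalized cross-correlation computed at every window position. This is a minimal sketch under assumed inputs: the function name `match_template`, the 3 × 3 template, the threshold of 0.9, and the synthetic image are all hypothetical.

```python
import numpy as np

def match_template(image, template, threshold=0.9):
    """Return (row, col, score) for window centres whose correlation passes the threshold."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    hits = []
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom == 0:
                continue  # flat patch: correlation is undefined
            score = (p * t).sum() / denom
            if score >= threshold:
                hits.append((r + th // 2, c + tw // 2, float(score)))
    return hits

# Hypothetical crown template: brightest in the centre, darker towards the edge.
template = np.array([[1, 2, 1],
                     [2, 4, 2],
                     [1, 2, 1]], dtype=float)
image = np.zeros((5, 9))
image[1:4, 1:4] = template       # a crown identical to the template
image[1:4, 5:8] = 2 * template   # a brighter crown with the same shape
hits = match_template(image, template, threshold=0.9)
print(hits)
```

Because the correlation is normalized, the brighter crown matches as well as the identical one, which is why the method depends on similar crown shapes more than on absolute brightness.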
Deep learning methods: Deep learning methods, such as instance segmentation, are increasingly being utilized for the detection and delineation of single tree crowns [34]. Commonly used model architectures include Mask R-CNN [35], artificial neural networks [15], and U-Nets [36]. These methods offer advantages such as the ability to use multi-band images as input, instead of relying solely on a single band, and a focus on textural features, which is beneficial as adjacent trees might have very similar spectral properties. However, a disadvantage is the need for training data, which are typically obtained through time-consuming manual crown delineation and/or field work.
Point cloud-based methods: Terrestrial/airborne laser scanning and UAV-SfM provide 3D point cloud data that can be used directly for single-crown delineation. Most studies focusing on 3D methods utilize LiDAR data, as UAV-SfM does not penetrate dense crown structures well, especially during the leaf-on season. As a result, UAV-SfM leaf-on point clouds tend to be less dense in the lower canopy and on the forest floor. Various point-cloud-based approaches have been applied for ITCD, including K-means clustering [37], template matching [38], voxel space approaches [39], and mean-shift segmentation [40]. Deep learning models are trained using reference point clouds and have been used to segment single trees [39], employing frameworks such as PointNet [41].
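As one example of the point-cloud-based approaches, K-means clustering can be sketched on the horizontal coordinates of the points. The version below is a generic Lloyd iteration, not the specific procedure of [37]; the function name `kmeans_xy`, the use of approximate tree positions as initial centres, and the sample points are assumptions.

```python
import numpy as np

def kmeans_xy(points_xy, init_centers, n_iter=10):
    """Cluster (N, 2) points around initial centres with plain Lloyd iterations."""
    pts = np.asarray(points_xy, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()

    def assign(current_centers):
        # Distance of every point to every centre; pick the nearest centre.
        dist = np.linalg.norm(pts[:, None, :] - current_centers[None, :, :], axis=2)
        return dist.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for k in range(len(centers)):
            members = pts[labels == k]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[k] = members.mean(axis=0)
    return assign(centers), centers

# Hypothetical crown points (x, y in metres) from two trees about 10 m apart.
points_xy = np.array([[0, 0], [1, 0], [0, 1], [10, 0], [11, 0], [10, 1]])
labels, centers = kmeans_xy(points_xy, init_centers=[[0, 0], [10, 0]])
print(labels)   # [0 0 0 1 1 1]
print(centers)
```

The result depends strongly on the number and position of the initial centres, so a preceding tree detection step is commonly needed to supply them.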
Several recent studies applied one or more of the above-mentioned methods to detect tree positions and perform crown delineation using UAV-based imagery data products. A selection of relevant studies is presented in Table 1.
Table 1. Selection of recent studies on tree detection and crown delineation in forest ecosystems using UAV optical data.
As emphasized by several of these studies, detection and delineation of tree crowns in complex forest structures with dense canopy closure and overlapping tree crowns remain challenging [35,51]. Some studies focus on ITCD in forests with low to moderate canopy closure or open forest stands [42,48,53], as well as on forest plantations characterized by a more regular tree spacing and structure [44,46]. These forest types tend to facilitate the detection of crown boundaries and the estimation of tree numbers, as most algorithms perform better in homogeneous forest stands with lower canopy closure [19].

This entry is adapted from the peer-reviewed paper 10.3390/rs15184366
