Green space is increasingly recognized as an important component of the urban environment, yet inventorying urban vegetation is a costly and time-consuming process. Various types of remote sensing can be used in the automated mapping of urban vegetation.
1. Introduction
The presence of vegetation in an urban ecosystem has a multitude of beneficial effects. The proximity of green space has been linked to improved physical and psychological wellbeing of city dwellers [
1]. Urban green also provides a whole range of environmental benefits [
2,
3,
4]. The specific services that provide these benefits include, among others, (a) sequestration of carbon through photosynthesis [
4], (b) noise reduction [
5] and (c) provision of shade and the attenuation of the urban heat island effect [
6]. The latter is becoming increasingly important due to the ongoing climatic warming [
7,
8].
Services rendered by urban green depend on (a) the vegetation type, (b) structure and (c) local context [
9,
10,
11]. Assessing services rendered by urban green requires a suitable scale of analysis, depending on the service of interest. As an input for studying the urban heat island effect, information on the spatial distribution and density of vegetated areas may be sufficient [
12]. However, the services and disservices of urban green can also be studied at a more detailed level, for different species, as is often done for urban trees. As an example, the absorption of airborne pollutants is much larger for some plant species than for others [
13]. Several species can be linked to ecosystem disservices such as the spread of allergens during the pollination season and the release of volatile organic compounds [
14,
15], which is an important factor to take into account when designing urban green spaces. To facilitate sustainable urban planning, it is important to establish a detailed inventory of urban green to adequately manage and to understand the ecological services rendered by vegetation [
13,
16]. The level of detail of such an inventory, and hence the mapping approach required for its creation, can vary depending on its purpose.
Most larger cities already monitor vegetation through extensive field surveys; however, this only provides information concerning the public green space. Private properties remain largely unmonitored, despite their significant contribution to ecosystem services [
17]. Monitoring urban vegetation is also costly and time-consuming [
18], hence the increasing interest in automated mapping techniques. The use of remote sensing imagery to distinguish different land cover and land use types in an urban environment is a mature sub-discipline of remote sensing research. However, traditionally, land cover mapping in an urban context often concerned only two vegetation classes: high vegetation and low vegetation [
19,
20]. Nevertheless, the use of remote sensing imagery for the detailed mapping of urban vegetation is gaining interest from different public and private actors. The development of this branch of remote sensing research has been made possible by an improvement in remote sensing technology. More specifically, it is now possible to capture spatial data with a higher temporal, spectral and spatial resolution than before. Additionally, the increase in available computational power has enabled researchers to process the available data faster and in ways that were previously not feasible.
A whole body of research already exists concerning the mapping of tree species and crop types in a rural environment. Nonetheless, research on the mapping of urban green has its own challenges that are related to the spatial and spectral heterogeneity of the urban landscape and the complex three-dimensional structure of urban areas, resulting in large shadowed areas, multiple scattering and issues of geometric mismatch in combining different data sources [
21,
22,
23].
2. Vegetation Typologies
Broadly, a distinction can be made between two approaches taken by scholars when mapping vegetation in an urban environment: vegetation types are either defined based on functionality or on taxonomic classes.
2.1. Functional Vegetation Types
Many studies focus on the mapping of urban land use/land cover, yet, in the majority of these works, the focus does not lie on the mapping of urban vegetation, but on characterizing built-up areas with different functionalities (residential, commercial, etc.) or morphology [
33,
34,
35]. In these studies, vegetation is usually represented by only one or two classes (e.g., high versus low vegetation, woody versus herbaceous). A number of the studies reviewed, though, define vegetation classes from a functional perspective, whereby the nature of the vegetation classes and the level of thematic detail depends on the envisioned use of the map. In these studies, we see an increasing focus on the role of different types of vegetation as providers of ecosystem services [
36]. Generally, four types of services are recognized: (a) provisioning, (b) regulating, (c) supporting and (d) cultural services [
37]. Various frameworks have been proposed for defining urban vegetation classes based on the kinds of ecosystem services they provide.
Mathieu et al. [
36] focus on supporting/habitat services, providing a living space for organisms. The classes in their study on mapping vegetation communities in Dunedin City, New Zealand, were based on a habitat classification scheme specifically designed for the urban environment, where mixed exotic–indigenous vegetation occurs more than in a rural environment [
38]. The first level in their classification defines four structural habitat categories (trees, scrub, shrubs and grassland), which, at the second level of the hierarchy, are further subdivided into a total of 15 classes based on (a) spatial arrangement (tree stands, scattered trees, isolated groups of trees), (b) the presence of native or non-native species, or a mix of both (for trees, scrub, shrubs), and (c) the type of management (for grassland). Using object-based classification of IKONOS imagery, they obtained a relatively low classification accuracy of 64% for these classes, mainly caused by confusion between scrub habitats, shrubland and vineland, as well as between parks and woodland.
Bartesaghi-Koc et al. [
12] focus on the regulating services of green infrastructure in the greater metropolitan area of Sydney. In their study, they propose a general green infrastructure typology to support climate studies. Inspiration for this typology was drawn from existing standard land cover classification schemes, such as LULC [
39], LCZ [
40], HERCULES [
41] and UVST [
42]. Such a typology is valuable given the effectiveness of green infrastructure in mitigating the intensity of heatwaves and in decreasing urban temperatures overall [
43]. In their scheme, the differentiating factor is not only the vegetation life form but also the structural characteristics of the vegetated area (e.g., vegetation density). The distinction between different classes in their scheme is based on three dimensions: (a) height of the vegetation (or life form), (b) structural characteristics and (c) composition of the ground surface. Using thermal infrared, hyperspectral, LiDAR and cadastral data, they reached an overall accuracy of 76% in mapping the classes of their proposed scheme.
Kopecká et al. [
44] and Degerickx et al. [
30] took all four ecosystem services into consideration in defining the vegetation types in their studies and both ended up with a total of 15 classes. Both tried to use expert knowledge to define a fixed set of categories. Kopecká et al. [
44] do not take the physiological or structural characteristics of the vegetation explicitly into consideration but rather make a distinction between urban vegetation types based on the land use in which the vegetation is embedded. Degerickx et al. [
30] focus on the characteristics of urban green elements by initially distinguishing three main classes based on the height of the vegetation: trees, shrubs and herbaceous plants. Each of these classes is then further divided into subclasses based on spatial arrangement (e.g., tree/scrub patches, rows of trees, hedges, etc.), vegetation composition (grass, flowers, crops, etc.) and type of management (e.g., plantations, lawns, meadows, vegetable gardens, extensive green roofs, etc.).
The automated part of the classification procedure by Kopecká et al. [
44] only entailed two vegetation classes (tree cover and non-woody vegetation). Because the authors assumed the spectral separability of the detailed classes in their scheme to be too low, further distinction between classes was made based on visual interpretation of the vegetated areas. Degerickx et al. [
30] performed the mapping in a semi-automated way. They combined high-resolution airborne hyperspectral imagery (APEX sensor) with LiDAR data and applied an object-oriented classification approach, followed by a rule-based post-classification step aimed at improving the quality of the classification. With this workflow, they achieved an overall accuracy of 81% on the 15 classes defined.
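The overall accuracies quoted in this section (64%, 76%, 81%) are conventionally derived from a confusion matrix of reference versus mapped classes. The following is a minimal sketch of that computation, not the evaluation code of any of the cited studies; the function names and the three-class matrix are hypothetical and serve only as an illustration.

```python
def overall_accuracy(confusion):
    """Share of reference samples that lie on the diagonal of a square confusion matrix.

    confusion[i][j] = number of reference samples of class i mapped as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_recall(confusion):
    """Producer's accuracy per class: correct samples / reference samples of that class."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(confusion)]


# Hypothetical 3-class example (rows: reference, columns: mapped).
example = [
    [40, 5, 5],    # trees
    [10, 30, 10],  # shrubs
    [0, 10, 40],   # herbaceous
]
print(round(overall_accuracy(example), 3))
print([round(r, 2) for r in per_class_recall(example)])
```

A single overall figure can hide strong differences between classes, which is why the per-class values are reported alongside it in this sketch.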
Various studies focusing on differentiating between functionally relevant vegetation types are specifically aimed at defining the degree of thematic detail that can be achieved by analyzing the spatial/spectral separability of the classes during the image segmentation and/or image classification phase (e.g., [
45,
46,
47,
48,
49,
50]), whereby it is common to use a hierarchical classification approach (e.g., [
51]).
2.2. Taxonomic Classes
In rural areas, classification at the species level has been thoroughly researched in the context of automated crop classification and forestry research. However, the urban environment poses specific challenges. As mentioned before, (a) the spectral/spatial heterogeneity caused by a large variety in background material, (b) the disturbing effects of shadow casting and (c) the different spatial arrangements in which vegetation can occur make the mapping of urban vegetation quite challenging [
52,
53,
54]. Furthermore, the availability of reference data for training image classifiers for mapping at species level is often limited due to a lack of effort by local public authorities in maintaining urban green inventories. On top of this, a large part of the vegetation in urban areas is found on private property, for which relatively little information on vegetation cover is known.
In urban environments, mapping up to the species level has almost exclusively been done for tree species. One of the (obvious) reasons is that tree crowns are large enough to be recognized on high and very high spatial resolution imagery that is available nowadays (for an overview of sensors used in the studies included in this review, see
Table 2). Additionally, the difference in spectral signature and 3D structure between tree species is sufficient to expect acceptable accuracies for the mapping of urban trees [
55,
56].
Various authors have attempted to classify urban trees at species level, although it is difficult to compare these studies due to the high variety in the tree species that are mapped (e.g., [
53,
54,
57]). This can be attributed to the fact that studies on this topic are often of a local nature and linked to applied research (e.g., related to tree species inventorying or ecosystem service assessment in a specific study area). Researchers will generally not consider applying their proposed methodology on a benchmark dataset.
A distinction must be made between the identification of trees that are part of a denser canopy (e.g., [
58,
59]) or the identification of single standing trees (e.g., street trees). Often, both will be included in the same study when the area entails both urban parks and built-up areas. However, different approaches may be required to obtain optimal mapping results in each case. In an urban forest setting, trees will be located close to each other, so textural measures derived from the spectral imagery can significantly improve classification, whereas the utility of this information decreases when dealing with freestanding trees (e.g., [
53]). On the other hand, a freestanding tree can often develop unobstructed, which makes it more representative of its species and easier to identify [
60].
The presence of background material in the pixel is often an important source of confusion when mapping freestanding trees. Unlike in a natural environment, the background material in an urban setting is often much more diverse, making it difficult to filter out its influence [
53,
54]. The spatial resolution of the imagery is of course important in mitigating these effects: the lower the spatial resolution, the larger the impact of mixing with background material will be.
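The effect of pixel size on background contamination can be illustrated with a simple linear mixing model, in which the reflectance of a pixel is the area-weighted average of crown and background reflectance. This is a sketch under that assumption (real mixing in urban scenes is also affected by multiple scattering and shadow); the function names and all reflectance values are hypothetical.

```python
def crown_fraction_in_pixel(crown_area_m2, pixel_size_m):
    """Upper bound on the share of one square pixel that a crown of the given area can fill."""
    if pixel_size_m <= 0:
        raise ValueError("pixel_size_m must be positive")
    return min(1.0, crown_area_m2 / (pixel_size_m ** 2))


def mixed_pixel_reflectance(crown_fraction, crown_reflectance, background_reflectance):
    """Linear mixing: area-weighted average of crown and background reflectance."""
    if not 0.0 <= crown_fraction <= 1.0:
        raise ValueError("crown_fraction must lie in [0, 1]")
    return (crown_fraction * crown_reflectance
            + (1.0 - crown_fraction) * background_reflectance)


# Hypothetical values: a 9 m2 crown over asphalt, observed in the NIR.
crown_nir, asphalt_nir = 0.50, 0.20
for pixel_size in (1.0, 4.0):
    fraction = crown_fraction_in_pixel(9.0, pixel_size)
    observed = mixed_pixel_reflectance(fraction, crown_nir, asphalt_nir)
    print(pixel_size, round(fraction, 4), round(observed, 4))
```

With the 1 m pixel the crown can fill the pixel completely, whereas with the 4 m pixel almost half of the signal comes from the background.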
The broadest taxonomic distinction one can make in mapping trees is either between deciduous and evergreen species or between angiosperms and gymnosperms. Both types of distinction can generally be made with high accuracy [
61], especially when including LiDAR data, due to the characteristic difference in tree crown shape [
18,
62,
63]. Within each category, the accuracy with which species can be identified may differ. Xiao et al. [
61] found that, on a physiognomic level, broadleaf deciduous species were easier to identify than broadleaf evergreen species and conifer species when using imagery captured by the AVIRIS sensor (3.5 m spatial resolution), although it should be noted that sample sizes in this study were small, the dataset was highly unbalanced and the differences in mean mapping accuracy between the different categories were limited. Higher accuracies were achieved for evergreen species by Liu et al. [
64] when using airborne hyperspectral imagery (CASI sensor) with a higher spatial resolution (1 m) in combination with LiDAR data. This indicates that very high-resolution imagery in combination with structural information seems required for mapping needleleaf trees at species level. This can be attributed to the similarity in spectral signature between these species and therefore the higher reliance on information about tree crown structure [
54,
60]. Despite a better spectral distinction between different broadleaf species, crown structure also appears to be the most important discriminating factor for identifying broadleaf trees when fusing various data sources. Alonzo et al. [
65], using AVIRIS imagery, concluded that the highest classification accuracies are obtained for species with large, densely foliated crowns. It is beneficial if the crown is densely foliated since this avoids contamination of background material in the spectral signature of the tree [
61,
65]. Smaller tree crowns increase the risk that the pixel size of the spectral imagery is too large to avoid mixture with the background material [
52]. In the latter case, the inclusion of structural information from LiDAR data can be very valuable [
18]. Another reason for the importance of a large crown size is the higher risk of a co-registration error between the reference data and the imagery or between the various data sources (usually LiDAR and spectral imagery) for smaller crowns. Of course, the between-class spectral and/or structural heterogeneity of the trees within a dataset will also influence the accuracy of the classification. More specifically, it is easier to discriminate between species of a different genus than between species of the same genus [
66].
Shouse et al. [
67] used both medium-resolution Landsat imagery and very-high-resolution aerial imagery to map the occurrence of Bush honeysuckle, a pervasive invasive exotic plant (IEP) in eastern North America. Unsurprisingly, the use of imagery with a higher resolution resulted in higher accuracies (90–95%). However, the accuracy scores obtained with Landsat imagery proved to be still reasonably high (75–80%). Important to note is that most trees in the study area were in leaf-off conditions when the imagery was captured. Chance et al. [
68] mapped the presence of two invasive shrub species in Surrey, Canada. An accuracy of 82% was achieved for the mapping of Himalayan blackberry and 82% for English ivy using airborne hyperspectral imagery (1 m spatial resolution) in combination with LiDAR-derived variables. The classification of smaller plant species comes with additional challenges; for example, the object of interest will often be located under a tree canopy, especially in densely vegetated areas. Chance et al. [
68] therefore made a distinction between open areas and areas located under a tree canopy, whereby the latter were mapped solely using LiDAR-derived variables.
3. Remote Sensing Data
3.1. Optical Sensors
A wide variety of multi- and hyperspectral sensors have been used for the classification of urban green. The utility of the imagery is determined mainly by its spectral, spatial and temporal resolution. A high spatial resolution is, in most cases, desirable to ensure that the vegetation object of interest is larger than the size of a pixel [
61]. Unfortunately, high spatial resolution often comes at the cost of lower spectral resolution, especially when dealing with satellite imagery. This is an important trade-off since, generally, the inclusion of more detailed spectral information leads to improved mapping results [
30,
69].
Certain regions in the electromagnetic spectrum are more important than others for distinguishing various types of vegetation. A detailed representation of reflectance characteristics in specific parts of the visual, NIR and SWIR regions is crucial in this regard [
54,
55]. Li et al. [
70] found that the newly added red edge and NIR2 bands of WorldView-2 and WorldView-3 contribute significantly more to the discrimination of various tree species than the traditional four multispectral bands (red, green, blue, NIR). In contrast, Alonzo et al. [
18], who studied urban tree species mapping using 3.7 m AVIRIS data, found limited discriminatory value in the NIR range due to the very high within-class spectral variability in this region. The green edge, green peak and yellow edge, on the other hand, showed a larger contrast between various tree species [
18,
23,
54,
64,
65].
In contrast to research performed in forested areas, textural information on the surroundings of the tree crown does not improve the classification results for urban trees [
53]. This can be attributed to the fact that urban trees are often freestanding. As such, the classifier will not benefit from neighborhood information [
70]. On the other hand, if the spatial resolution is sufficiently high, it is beneficial to include textural information concerning the crown structure of the tree [
60] (see
Section 3.3.1).
It should be noted that the disturbing effect of shadow plays a larger role in urban environments than in natural environments due to the 3D structure of urban areas. It is important to take the influence of shadow on the reflectance of vegetation objects into consideration, especially when mapping tree species. In a forest environment, a large tree will rarely cast a shadow over the complete crown of a smaller tree, whereas this is often the case when shadow is cast by a large building. Different authors deal with shadow in different ways, either (a) by omitting elements that are affected by shadow from the training set (e.g., [
71]), (b) by performing a shadow correction [
23,
46] or (c) by including shadowed areas as a separate class (e.g., [
53,
58,
72]).
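Strategies (a) and (c) can be sketched on a small list of training samples. The sketch below assumes that shadowed samples are flagged with a simple mean-brightness threshold; this rule, the threshold of 0.08 and the sample values are hypothetical and are not taken from the cited studies, which use their own shadow detection procedures.

```python
def mean_brightness(bands):
    """Average reflectance over all bands of one sample."""
    return sum(bands) / len(bands)


def is_shadowed(bands, threshold=0.08):
    """Assumed rule: a sample counts as shadowed when its mean reflectance is below the threshold."""
    return mean_brightness(bands) < threshold


def drop_shadowed(samples, threshold=0.08):
    """Strategy (a): omit shadow-affected samples from the training set."""
    return [(bands, label) for bands, label in samples
            if not is_shadowed(bands, threshold)]


def relabel_shadowed(samples, threshold=0.08, shadow_label="shadow"):
    """Strategy (c): keep every sample but move shadowed ones to a separate class."""
    return [(bands, shadow_label if is_shadowed(bands, threshold) else label)
            for bands, label in samples]


# Hypothetical samples: (blue, green, red, NIR) reflectance and a class label.
samples = [
    ([0.05, 0.08, 0.04, 0.45], "tree"),
    ([0.02, 0.03, 0.02, 0.10], "tree"),   # tree crown in building shadow
    ([0.08, 0.10, 0.09, 0.30], "lawn"),
]
print(len(drop_shadowed(samples)))
print([label for _, label in relabel_shadowed(samples)])
```

Strategy (a) keeps the training set spectrally clean but leaves shadowed areas unmodeled, whereas strategy (c) yields a map in which shadowed vegetation is explicitly flagged rather than assigned to a vegetation class.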
Imagery with a High Spatial Resolution (1–5 m)
This category consists of both airborne and spaceborne sensors. The number of spectral bands and spectral regions that are captured by these sensors may vary substantially.
High-resolution imagery is used for mapping functional vegetation types as well as for species-level classification. RapidEye imagery with a 5 m resolution was used by Tigges et al. [
57], due to its relatively short revisit time, to map homogeneous tree plots using a multi-temporal dataset, indicating that a classification at the species level is possible but only for areas with multiple trees of the same genus. However, in an urban environment, one often needs to be able to map single standing trees as they make up a large portion of the urban vegetated landscape. IKONOS imagery with a resolution of 4 m was used and compared to higher-resolution imagery by Sugumaran et al. [
58] (1 m airborne photographs) and Pu and Landry [
53] (WorldView-2 imagery) for the classification of individual trees. Both authors concluded that better results can be achieved when using imagery with a higher spatial resolution, since this enables the capture of pure pixels within each tree crown. Naturally, this also depends on the tree species in question and the maturity of the trees [
61]. Lower spatial resolution can also be a limiting factor in mapping heterogeneous urban forests, due to the higher likelihood of overlap of crowns of different types of trees [
61]. Both for the detection of street trees and of trees in an urban forest setting, the use of structural information through LiDAR can vastly improve the identification of smaller trees when working with imagery at resolutions of 3.5 m or less [
18], depending on the size of the small tree crowns.
At spatial resolutions of 3 m or finer, the mapping of individual trees becomes more feasible. Both spaceborne and airborne sensors can produce imagery at this resolution. While airborne sensors often deliver imagery with a higher spectral and spatial resolution, the capacity of satellite sensors to make recurrent measurements of the same location makes them particularly suited for multi-temporal data acquisition and mapping based on vegetation phenology, especially if fused with other types of data, such as LiDAR or aerial photography [
55,
56]. The higher spatial resolution that is often associated with airborne sensors makes airborne remote sensing an interesting source for mapping individual vegetation elements, which, on this type of imagery, extend over multiple pixels (e.g., freestanding trees). However, the increased spectral information delivered by these sensors can also be interesting for mapping other, often larger vegetation elements. Degerickx et al. [
30] and Bartesaghi-Koc et al. [
12] made use of hyperspectral imagery from the APEX and Hypex VNIR 1600 sensors, respectively, to map functional green types. Degerickx et al. [
30] demonstrated the added value of hyperspectral data (APEX, 218 bands) compared to WorldView-2 (eight bands), especially for the mapping of thematically more detailed functional classes (see also
Section 3.1.1). Although it is possible to use all bands (e.g., [
81]), the abundance of information captured by hyperspectral sensors is often condensed before it is used in a machine learning context. This can be done either through the use of appropriate spectral indices [
54,
64] or through the use of dimension reduction techniques [
30,
89] (see
Section 3.3.1).
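As an illustration of how a spectral index condenses a hyperspectral spectrum into a single feature, the sketch below computes the widely used normalized difference vegetation index, NDVI = (NIR - red) / (NIR + red), from the bands closest to two target wavelengths. NDVI is chosen here only as a familiar example; the cited studies use their own sets of indices, and the target wavelengths (670 nm and 800 nm), function names and spectrum values are assumptions.

```python
def nearest_band(wavelengths_nm, target_nm):
    """Index of the band whose centre wavelength is closest to the target."""
    return min(range(len(wavelengths_nm)),
               key=lambda i: abs(wavelengths_nm[i] - target_nm))


def ndvi(spectrum, wavelengths_nm, red_nm=670.0, nir_nm=800.0):
    """NDVI from the bands nearest to the assumed red and NIR target wavelengths."""
    red = spectrum[nearest_band(wavelengths_nm, red_nm)]
    nir = spectrum[nearest_band(wavelengths_nm, nir_nm)]
    if nir + red == 0:
        return 0.0  # avoid division by zero for empty or masked pixels
    return (nir - red) / (nir + red)


# Hypothetical five-band vegetation spectrum.
wavelengths = [450.0, 550.0, 660.0, 720.0, 810.0]
reflectance = [0.04, 0.08, 0.05, 0.30, 0.45]
print(round(ndvi(reflectance, wavelengths), 3))
```

Selecting bands by wavelength rather than by position keeps the same function usable for sensors with different band layouts.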
Imagery with a Very High Spatial Resolution (≤1 m)
When considering imagery with a spatial resolution smaller than or equal to 1 m, we may be dealing with aerial photography or with multi- or hyperspectral airborne sensors. However, various satellite sensors also include a panchromatic band with a resolution below 1 m. The process of pan sharpening has become increasingly common to obtain multispectral spaceborne information at an increased spatial resolution and can also be of interest for the accurate delineation of vegetation objects in an urban context [
51,
53]. The continuous development of new pan sharpening techniques using deep learning (e.g., [
90]) has made this an interesting option; however, one needs to be aware of the potential loss of spatial or spectral information in the pan-sharpened image.
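The trade-off between spatial and spectral fidelity can be made concrete with a classical ratio-based (Brovey-style) pan-sharpening scheme, shown here for a single pixel. This is not one of the deep learning methods referred to above; it is a simple stand-in, and the sketch assumes that the multispectral values have already been resampled to the panchromatic grid. The function name and the values are hypothetical.

```python
def brovey_pixel(ms_values, pan_value):
    """Scale each multispectral band so that the band mean matches the panchromatic value.

    Band ratios are preserved, but the absolute reflectance of every band changes,
    which is one way in which spectral information can be distorted.
    """
    intensity = sum(ms_values) / len(ms_values)
    if intensity == 0:
        return [0.0] * len(ms_values)
    return [value * pan_value / intensity for value in ms_values]


# Hypothetical pixel: three multispectral bands and a brighter panchromatic value.
sharpened = brovey_pixel([0.1, 0.2, 0.3], 0.4)
print([round(v, 3) for v in sharpened])
```

The output keeps the relative band proportions of the input while inheriting its brightness from the panchromatic band, so any mismatch between the spectral range of the panchromatic band and the multispectral bands propagates into the result.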
Currently, aerial photography is still the most widely used source for the spatially detailed mapping of urban vegetation. Despite the high spatial resolution of true-color aerial photography, there are indications that the spectral information in RGB aerial photos is too limited for vegetation mapping, even for the identification of relatively broad vegetation classes, and needs to be combined with structural information to be useful [
17]. While the use of multi-temporal RGB imagery, as provided by some commercial vendors, may aid in the identification of tree species [
56] or other vegetation types by capturing the differences in phenology between different species, aerial photography including an NIR band is used more often for vegetation mapping. Li and Shao [
51] used 4-band NAIP data for mapping broad vegetation types (forest, individual trees, shrub, lawns and crops) and obtained a good degree of accuracy (>90%) when using an object-based classification approach. For the classification of tree species, the use of very high-resolution imagery has been shown to offer unique benefits. The small size of individual pixels allows one to capture the variation within a tree crown at a more detailed level, therefore increasing the potential of defining meaningful textural features [
20,
60,
72,
88]. Puttonen et al. [
72] made an explicit distinction between the illuminated and shaded parts of a tree crown, using the mean value of each part and the ratio between the two parts to train their classifier. The approach led to improved results compared to the method developed by Persson et al. [
91], which does not make this distinction. Iovan et al. [
20] found that both first- and second-order textural features, calculated at the tree crown level, contain important information for discriminating between two species. When the resolution of the imagery is high enough, the analysis of tree crown texture can become increasingly detailed. Zhang and Hu [
60] used imagery with a resolution of 0.06 m to derive several descriptors from the longitudinal reflectance profile of each tree crown. They showed that the longitudinal profiles contain valuable information when the spatial resolution of the imagery is sufficiently high. Additionally, this type of information appeared to have a positive influence on the robustness of the classification with regard to differences in illumination and the influence of shadow.
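The illuminated/shaded crown features described above (the mean of each part and their ratio) can be sketched as follows. How the two parts were separated in the cited work is not stated here, so splitting the crown pixels at their median brightness is purely an assumption, as are the function name and the pixel values.

```python
from statistics import mean, median


def illumination_features(crown_pixels):
    """Mean of the brighter and darker crown parts, plus their ratio.

    Assumption: pixels above the median brightness count as illuminated,
    the remaining pixels as shaded.
    """
    if not crown_pixels:
        raise ValueError("crown_pixels must not be empty")
    cut = median(crown_pixels)
    lit = [v for v in crown_pixels if v > cut]
    shaded = [v for v in crown_pixels if v <= cut]
    if not lit:  # uniform crown: no pixel exceeds the median
        lit = shaded
    lit_mean, shaded_mean = mean(lit), mean(shaded)
    ratio = lit_mean / shaded_mean if shaded_mean > 0 else float("nan")
    return {"lit_mean": lit_mean, "shaded_mean": shaded_mean, "ratio": ratio}


# Hypothetical single-band brightness values of one delineated crown.
crown = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]
print(illumination_features(crown))
```

Because the ratio is relative, it is less sensitive to overall brightness differences between images than the two means on their own.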
There are indications that combining a very high spatial resolution with a high spectral resolution can improve the mapping of tree species even further [
89]. However, so far, few studies with this type of data have been performed in an urban setting.
3.2. LiDAR
Light Detection And Ranging (LiDAR) technology can be used to infer the distance between the sensor and an object. It has been widely applied to generate detailed digital terrain and digital surface models. Various vegetation types or species have different three-dimensional structural characteristics that can be captured with LiDAR. Hence, the inclusion of LiDAR has been shown to significantly increase mapping accuracy both in an urban and a non-urban environment [
30,
56,
64,
71]. Several authors have used LiDAR as the sole source to distinguish between vegetation types or species [
62,
63]. While information about the shape of the tree is important, in functional vegetation mapping, LiDAR is mainly used to discriminate between various types of vegetation based on height (e.g., [
50,
87]). Besides LiDAR technology, height information can also be derived from stereoscopic imagery [
20,
50]. However, the use of this technology is less common for the purpose of vegetation mapping.
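Height-based discrimination typically starts from a canopy height model, obtained by subtracting the digital terrain model from the digital surface model. The sketch below shows this step followed by a simple height stratification; the thresholds of 0.5 m and 3 m and the class names are illustrative assumptions rather than values from the cited studies, and the stratification is only meaningful for pixels already identified as vegetation.

```python
def canopy_height_model(dsm, dtm):
    """Per-cell height above ground: surface elevation minus terrain elevation, clipped at 0."""
    return [[max(0.0, surface - terrain)
             for surface, terrain in zip(dsm_row, dtm_row)]
            for dsm_row, dtm_row in zip(dsm, dtm)]


def height_class(height_m, shrub_min=0.5, tree_min=3.0):
    """Assign a vegetated cell to a height stratum using assumed thresholds in metres."""
    if height_m >= tree_min:
        return "tree"
    if height_m >= shrub_min:
        return "shrub"
    return "herbaceous"


# Hypothetical 2 x 2 elevation grids in metres.
dsm = [[12.0, 10.2], [18.5, 10.0]]
dtm = [[10.0, 10.0], [10.0, 10.0]]
chm = canopy_height_model(dsm, dtm)
print([[height_class(h) for h in row] for row in chm])
```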
Various point cloud densities have been employed to map urban vegetation (see
Table 2). A higher point cloud density will generally lead to better results [
92]. LiDAR point clouds with a lower point cloud density (<10 points/m²) can provide sufficient information for the mapping of larger vegetation objects (e.g., large trees, homogeneously vegetated areas), especially when combined with spectral imagery [
12,
48,
57,
59]. Nevertheless, high-density LiDAR point clouds allow for the extraction of more complex information regarding the vegetation object. This can be especially important when dealing with small objects or, in the case of trees, high-porosity crowns [
18,
64,
68]. Moreover, for the classification of trees, the optimal point density might depend on the phenological stage of the tree, with a full canopy requiring lower density than a bare tree since the point cloud will only represent the outer shape of the tree [
62].
Table 2. Overview of the studies that include LiDAR in their analysis. Studies based on terrestrial laser scanning are not included in this table.
| Average Point Cloud Density | Vegetation Type | Species-Level Classification |
| <10 points/m² | [12,48,59,87] | [55,56,57,62,71,72,80,81] |
| >10 points/m² | [30,63] | [18,64,68] |
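The average point cloud density used to group the studies in Table 2 is simply the number of returns divided by the surveyed area. A minimal sketch, with hypothetical function names and survey figures; assigning a density of exactly 10 points/m² to the higher bin is an assumption, since the table does not cover that case.

```python
def point_density(n_points, area_m2):
    """Average number of LiDAR returns per square metre."""
    if area_m2 <= 0:
        raise ValueError("area_m2 must be positive")
    return n_points / area_m2


def density_bin(density, threshold=10.0):
    """Place a density in the lower or higher bin, split at 10 points/m2."""
    return "low" if density < threshold else "high"


# Hypothetical surveys over a 500 m x 400 m study area.
area = 500.0 * 400.0
for n_points in (1_200_000, 5_000_000):
    density = point_density(n_points, area)
    print(density, density_bin(density))
```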
It is common practice (also in a non-urban setting) to derive a range of features from the raw LiDAR data (e.g., related to the vertical profile of the vegetation), especially when working at a spatially more detailed level. However, besides geometric features, one can also extract useful information from the intensity of the return signal. Kim et al. [
62] found the mean intensity of the LiDAR return signal to be more important than structural variables in discriminating between different tree genera during the leaf-off phase.
Fusion of LiDAR Data and Spectral Imagery
Combining spectral imagery with LiDAR has become a common strategy for high-resolution vegetation mapping in urban areas. Feature importance analysis by Liu et al. [
64] (mapping of tree species) and Degerickx et al. [
30] (mapping of functional vegetation types) pointed out that the structural variables derived from the LiDAR data had higher importance than the hyperspectral variables used in their analyses, especially in shadowed areas, where spectral information becomes less conclusive [
68]. Voss and Sugumaran [
71] achieved a substantial increase of 19% in overall accuracy when including LiDAR-derived elevation and intensity features in combination with airborne hyperspectral imagery for classifying seven dominant tree species. This improvement in accuracy was ascribed by the authors to the insensitivity of LiDAR data to the influence of shadow and to the inclusion of height information. In a study by Katz et al. [
56], where a higher number of different species (16 in total) was mapped using multi-temporal aerial photography in combination with WorldView-2 imagery, the added value of LiDAR was limited. Similarly, Alonzo et al. [
18] concluded that the spectral information was still the main driver of mapping accuracy in discriminating between 29 different tree species using a combination of hyperspectral AVIRIS imagery and LiDAR. The different conclusions regarding the added value of LiDAR data in these studies can be attributed to several factors, such as the characteristics of the species considered, the number of species to be discriminated and the type of spectral sensor used.
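A common way to combine the two data sources is feature-level fusion, in which spectral and LiDAR-derived features of each object are stacked into one vector before classification. The sketch below illustrates this with column-wise standardization, so that features measured in metres do not dominate reflectance values in distance-based classifiers. The preprocessing actually used in the cited studies is not described here, so the function names, the standardization step and the values are assumptions.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Z-score every feature column; constant columns are mapped to 0."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [[(value - mu) / sigma if sigma > 0 else 0.0
             for value, (mu, sigma) in zip(row, stats)]
            for row in rows]


def fuse_features(spectral_rows, lidar_rows):
    """Stack spectral and LiDAR features per object, then standardize the result."""
    if len(spectral_rows) != len(lidar_rows):
        raise ValueError("both sources must describe the same objects")
    stacked = [list(s) + list(l) for s, l in zip(spectral_rows, lidar_rows)]
    return standardize_columns(stacked)


# Hypothetical objects: one spectral feature (NIR reflectance), one LiDAR feature (height in m).
spectral = [[0.2], [0.4]]
lidar = [[5.0], [15.0]]
print(fuse_features(spectral, lidar))
```

The fused rows can be passed to any classifier; feature importance analysis of the kind reported above then indicates which source contributes most.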
3.3. Terrestrial Sensors
Mobile terrestrial sensors gather information through a sensor mounted on a moving vehicle, usually an automotive system. As such, the observation of objects is not done from a top-down view but from a side perspective, providing additional information that cannot be gathered from airborne or spaceborne sensors. This can be very useful for analyzing vegetation that is located close to a building or vegetation in front yards [
93]. Data captured by terrestrial spectral sensors are gaining popularity for the mapping of roadside vegetation. A large benefit is the widespread availability of this type of data as they can be acquired through several online platforms, the most popular one being Google Street View. This type of data has been used to carry out virtual surveys for the quantification of street green [
94,
95] or the mapping of street trees [
96,
97]. The abundance of imagery also holds potential for the use of deep learning techniques [
98], which require sufficient reference data to obtain accurate results. The time and date of acquisition are important when working with these types of sensors, as a low Sun elevation angle (i.e., a large zenith angle) causes long shadows in the image, which makes the classification of objects more difficult [
99]. Additionally, although the trees are photographed from various angles, a top-down view may still contribute substantially to the correct identification of tree species. The challenge in combining street-level and top-down imagery lies in the correct matching of vegetation objects throughout various images and image types [
98].
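As an illustration of how street green might be quantified from a single street-level photograph, the following minimal sketch computes the fraction of pixels flagged as vegetation. It is not the method of the studies cited above: the excess-green index, the function name `green_fraction` and the threshold of 0.1 are all assumptions chosen for the example, and published workflows often rely on trained segmentation models instead.

```python
import numpy as np

def green_fraction(rgb, threshold=0.1):
    """Fraction of pixels flagged as vegetation in an RGB image.

    rgb: array of shape (H, W, 3) with non-negative values.
    threshold: hypothetical cut-off on the excess-green index (assumption).
    """
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=2)
    # Avoid division by zero for completely black pixels.
    total[total == 0] = 1.0
    # Chromatic coordinates: each band divided by the pixel's brightness.
    r, g, b = (rgb[..., i] / total for i in range(3))
    exg = 2 * g - r - b  # excess-green index
    return float((exg > threshold).mean())

# Toy 2x2 image: two green pixels, one grey pixel, one red pixel.
image = np.array([[[30, 200, 40], [20, 180, 30]],
                  [[128, 128, 128], [200, 30, 30]]])
print(green_fraction(image))
```

A simple colour threshold of this kind is sensitive to shadows and illumination, which is one reason the acquisition time discussed above matters.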
Terrestrial laser scanning is another type of data acquisition used for vegetation mapping. Puttonen et al. [
99] and Chen et al. [
100] found this type of data useful for the mapping of tree species; however, higher accuracy may be obtained when this type of data is merged with higher-resolution spectral data [
99]. The segmentation of various objects from terrestrial point clouds remains a significant challenge, on par with the actual classification of the point clouds, due to the large volume of data and the irregularities in the point cloud caused by the complexity of the urban environment [
100]. A direct comparison between terrestrial and airborne laser scanning has been done by Wu et al. [
101] for the classification of four tree species in an urban corridor. In the specific setting of this study, the airborne platform achieved slightly better results, even though the terrestrial data had a much higher point cloud density. Combining both data sources improved the results further, although the gain was limited to 6 percentage points, resulting in an overall accuracy of 78%. This limited gain obtained with terrestrial data might be due to the inconsistency in intensity features caused by strongly varying incident angles and ranging distances [
101].
The combined use of laser scanning with close-range photogrammetry, which is increasingly applied in forestry applications [
102], may also offer improved results in an urban context since both methods complement each other. The difference in light source between both methods means that the depth of penetration in the canopy is different and the point cloud will show different parts of the canopy. This can mitigate the negative consequences of gaps in the laser point cloud or issues related to low radiometric quality [
103,
104,
105]. Despite good results in other fields, the combination of both approaches for the mapping of urban vegetation was not encountered in the studies included in this review.
3.4. Importance of Phenology in Vegetation Mapping
A promising way to improve vegetation mapping is by making use of multi-temporal information such that the phenological characteristics of a plant species can be taken into consideration [
57,
66,
70,
73,
79,
106]. This has become possible with the launch of satellites with a short revisit time in combination with an adequate spatial and spectral resolution, such as RapidEye or PlanetScope. Especially for the recognition of different deciduous tree genera, where different species have different leafing out and blossoming patterns [
64], the acquisition of imagery during the crucial stages in the phenological cycle has the potential to improve the mapping results [
57].
Sugumaran et al. [
58] assessed the influence of seasonality on the overall classification accuracy for distinguishing oak trees from other tree species. Fall images produced the best results. This could be attributed to a shift in the blue band caused by changes in the amount of chlorophyll pigmentation for the oak species. This is in accordance with the results of Voss and Sugumaran [
71] and Fang et al. [
66]. Voss and Sugumaran [
71] assessed the influence of seasonality on the overall classification accuracy while mapping seven different tree species using hyperspectral airborne imagery with a resolution of 1 m, concluding that, despite no significant difference in the overall accuracy when acquiring the imagery in summer (July) or fall (October), the fall results showed higher average class-wise accuracy over different tree species. Fang et al. [
66] performed a more detailed analysis, using twelve WorldView-3 images spread over the year to classify trees both at the species and the genus level. A feature importance analysis revealed that although the fall imagery provided the best separability overall, spring imagery also aided classification at the species level. Additionally, they concluded that, within the fall period, the optimal acquisition time varied depending on the tree species in the dataset. Pu et al. [
77] found that spring (April) imagery provided better results than imagery from all other seasons for the classification of seven tree species using high-resolution Pleiades imagery. The tree genera with a distinct phenological pattern (e.g., early leafing out of the Populus genus) generally reached higher producer's and user's accuracy; this is also why it may be easier to discriminate between species of different genera [
66]. Capturing imagery at the appropriate dates and thorough knowledge of the phenological stages of the vegetation to be modeled are therefore crucial [
66,
70]. Acquiring such knowledge can be challenging, especially in an urban environment, where the anthropogenic effects on ground surface temperature may be substantial and may lead to the intensification of the temporal and spatial variations in leaf development [
107]. Acquisition time also matters when using LiDAR data. Kim et al. [
62] observed an increase in accuracy when using leaf-on as compared to leaf-off data in distinguishing between deciduous and evergreen tree species.
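The accuracy measures referred to in this section (overall accuracy and the class-wise producer's and user's accuracy) are conventionally derived from a confusion matrix. The sketch below shows one way to compute them, assuming that rows hold the reference classes and columns the predicted classes. The function name `accuracy_metrics` and the counts are hypothetical and do not come from any of the cited studies.

```python
import numpy as np

def accuracy_metrics(confusion):
    """Overall, producer's and user's accuracy from a confusion matrix.

    confusion[i, j] = number of reference samples of class i
    that were predicted as class j (layout is an assumption).
    """
    cm = np.asarray(confusion, dtype=float)
    correct = np.diag(cm)
    overall = correct.sum() / cm.sum()
    producers = correct / cm.sum(axis=1)  # per reference class (1 - omission error)
    users = correct / cm.sum(axis=0)      # per predicted class (1 - commission error)
    return overall, producers, users

# Hypothetical counts for three tree species.
cm = [[40, 5, 5],
      [10, 30, 10],
      [0, 5, 45]]
overall, producers, users = accuracy_metrics(cm)
print(overall, producers, users)
```

Two classifications can share a similar overall accuracy while differing in their class-wise values, which is why some studies report both. The sketch does not handle classes without any reference or predicted samples, for which the division is undefined.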
Besides using multi-temporal data to assess the influence of the time of data acquisition on the mapping results, multi-temporal data can be used directly in the classification. Tigges et al. [
57] used five RapidEye images captured over one year (at intervals of 1.5 months on average) to discriminate eight commonly occurring tree genera in Berlin, Germany. They observed that the overall error decreased with an increasing number of features from the multi-temporal imagery. Compared to single-date imagery (from the summertime), the kappa value increased from 0.52 to 0.83. The downside of using RapidEye data is the relatively low spatial resolution (5 m), which led the authors to focus on larger, uniform urban forests. Li et al. [
70] achieved an average improvement of 11% in overall accuracy by combining a WorldView-2 and WorldView-3 image taken in late summer and high autumn, respectively, for the identification of five dominant urban tree species in Beijing as compared to only using single-date imagery. A similar improvement was achieved by Le Louarn et al. [
76] using bi-temporal Pleiades imagery taken in high summer and early spring (March). Even RGB imagery may contain valuable information on the phenological evolution of plant species throughout the year. Katz et al. [
56] attained an increase in overall accuracy of 10 percentage points (63% to 74%) by including commercially available multi-temporal RGB Nearmap imagery (eight images) in addition to a WorldView-2 multispectral image and LiDAR data for the mapping of 16 common tree species in Detroit. Another way to acquire information regarding the phenological profile of different tree species is by using terrestrial imagery taken at specific intervals throughout the year. Abbas et al. [
108] achieved accuracies of up to 96% for the mapping of 19 tree species with bi-monthly hyperspectral, terrestrial imagery.
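Using multi-temporal data directly in a classification usually amounts to stacking the bands of co-registered images from several dates into one feature vector per pixel. The sketch below illustrates this together with the kappa statistic, assuming that the kappa values reported above refer to Cohen's kappa computed from a confusion matrix. The function names, image sizes and counts are hypothetical, and the random arrays merely stand in for real imagery.

```python
import numpy as np

def stack_dates(images):
    """Concatenate co-registered images of shape (H, W, bands) along the band axis."""
    return np.concatenate([np.asarray(img, dtype=float) for img in images], axis=2)

def cohen_kappa(confusion):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    cm = np.asarray(confusion, dtype=float)
    n = cm.sum()
    observed = np.trace(cm) / n
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    return (observed - expected) / (1 - expected)

# Five hypothetical acquisition dates, each a 4x4 image with 5 bands.
rng = np.random.default_rng(0)
dates = [rng.random((4, 4, 5)) for _ in range(5)]
features = stack_dates(dates)                      # shape (4, 4, 25)
samples = features.reshape(-1, features.shape[2])  # one 25-feature row per pixel

# Hypothetical two-class confusion matrix.
kappa = cohen_kappa([[45, 5], [10, 40]])
print(samples.shape, kappa)
```

The `samples` array can then be passed to any per-pixel classifier. Stacking assumes that all dates share the same grid, so in practice the images must first be co-registered and resampled.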
4. Concluding Remarks
Typologies used for mapping urban vegetation vary widely among scholars, depending on the intended use of the map product. Nevertheless, a distinction can be made between studies focusing on the mapping of functional vegetation types, linked to their role in the urban ecosystem, and taxonomy-based vegetation mapping, the latter being mainly concerned with the mapping of tree species or genera. The overview of studies highlights the potential and the limitations of different types of spaceborne, airborne and terrestrial sensors for urban vegetation mapping, both in terms of image acquisition technology and in terms of sensor characteristics (spectral, spatial and temporal resolution). It also demonstrates the merits of combining different types of sources, with each data source providing complementary information on the biophysical and structural characteristics of the vegetation. With the growing awareness of the role of urban vegetation as a provider of multiple ecosystem services, and the increasing number of complementary data sources available for urban mapping, applications in the field of urban vegetation mapping are likely to grow rapidly in the coming years. Currently, most taxonomy-based mapping efforts lack sufficient accuracy and completeness to warrant their use in detailed ecosystem service analysis studies. Nevertheless, new developments in imaging technology and data science offer great promise for the production of virtual urban green inventories, supporting the management of green spaces at the city-wide scale.
This entry is adapted from the peer-reviewed paper 10.3390/rs14041031