Cloud-Based Remote Sensing for Wetland Monitoring

Please note this is a comparison between Version 1 by ABDALLAH YUSSUF ALI ABDELMAJEED and Version 2 by Dean Liu.

The rapid expansion of remote sensing has brought recent advances in wetland monitoring. Integrating cloud computing with these techniques has been identified as an effective tool, especially for dealing with heterogeneous datasets.

wetland
peatland
cloud computing
remote sensing monitoring

1. Which Cloud Computing Service Model Has Been Utilized in Wetland Monitoring?

All the papers included in this review used platform as a service (PaaS) as the cloud computing model. PaaS is a development platform that allows researchers to develop and build RS-based applications directly in the cloud. PaaS differs from software as a service (SaaS) in that SaaS only hosts cloud applications that have already been developed, whereas PaaS hosts both finished applications and those still in development. To achieve this, PaaS must provide a development infrastructure, including a programming environment, tools, configuration management, and other components, as well as an environment that supports application hosting.

An example of PaaS is Google Earth Engine (GEE), which was the platform most commonly used to implement the studies and apply cloud computing (CC) ^{[1][2][3][4][5][6][7][8][9]}[22,25,26,27,28,29,30,31,32]. GEE is a freely available platform with open-source scripting that is frequently used to automate the processing of RS data for an area of interest, mainly at large scales. It hosts a vast global data catalogue, mainly from satellites, with spatial resolutions ranging from coarse (1 km, MODIS) to high (10 m, Sentinel 2), which are sufficient for monitoring wetlands at scales from local and regional to continental and global. At the local scale, which is often the wetland level where research infrastructure and experimental sites with small plots are located, wetlands can be monitored either by applying CC under the SaaS model (time- and cost-efficient) or by traditional methods, i.e., ground-based measurements with field instruments (neither time- nor cost-efficient). With dedicated CC-based software, however, data from fixed sensors in the study area can be read and analysed, and quantitative and visualized results can be shown in real time. These time and cost savings help researchers and professionals closely track wetland status and minimize potential risks (e.g., fire) and maintenance costs. PaaS and RS can integrate a vast quantity of data, tools, and programs; by linking many wetlands through the same technology, global monitoring and tracking of changes in those study areas becomes possible and straightforward, thanks to Digital Twin, IoT, and CC.

2. How Widely Utilized Are the Different Monitoring Applications of Remote Sensing Data on Wetlands Using Cloud Computing, and What Are Their Limitations and Accuracy?

The requirements of the data needed to monitor an ecosystem depend on the purpose of its monitoring. Five types of monitoring purposes have been selected as subclasses of wetland monitoring in the articles analysed (Table 1).

Table 1. The subclasses of wetland monitoring strategies in the analysed articles.

Monitoring Strategies	References
Prediction (1 article)	^[10][33]
Time series analysis (6 articles)	^[5]^[7]^[11]^[12]^[13]^[14][28,30,34,35,36,37]
Mapping (11 articles)	^[8]^[12]^[15]^[16]^[17]^[18]^[19]^[20]^[21]^[22]^[23][23,24,31,35,38,39,40,41,42,43,44]
Classification (15 articles)	^[1]^[4]^[24]^[25]^[26]^[27]^[28]^[29]^[30]^[31]^[32]^[33]^[34]^[35]^[36][9,17,22,27,45,46,47,48,49,50,51,52,53,54,55]
Change detection (17 articles)	^[2]^[3]^[6]^[37]^[38]^[39]^[40]^[41]^[42]^[43]^[44]^[45]^[46]^[47]^[48]^[49]^[50][25,26,29,56,57,58,59,60,61,62,63,64,65,66,67,68,69]

Satellite data were the most used data type in most of the studies, either alone or in combination with other types of remote sensing data (airborne, UAV, or in situ), regardless of the monitoring strategy applied (Figure 1). This widespread use is owed to the open-access products of space agencies such as the European Space Agency (ESA) and the National Aeronautics and Space Administration (NASA), which provide products with spatial resolutions of 10 m (Sentinel 2) or temporal coverage dating back to 1972 (Landsat 1). These products were mostly used for detecting changes in wetlands (34% of articles), wetland classification (30% of articles), and wetland mapping (22% of articles).
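The shares quoted above follow from the article counts in Table 1. A minimal sketch in Python, assuming the percentages are taken over the sum of the five Table 1 counts (50 strategy assignments in total); the variable names are illustrative:

```python
# Article counts per monitoring strategy, taken from Table 1.
counts = {
    "Prediction": 1,
    "Time series analysis": 6,
    "Mapping": 11,
    "Classification": 15,
    "Change detection": 17,
}

# Assumption: shares are computed over the sum of all Table 1 counts.
total = sum(counts.values())

# Share of each strategy as a rounded percentage of the total.
shares = {name: round(100 * n / total) for name, n in counts.items()}

# Print strategies from the most to the least frequent.
for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {pct}%")
```

Under this assumption, the computed shares match the 34%, 30%, and 22% figures given in the text for change detection, classification, and mapping.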

Figure 1. Distribution of the number of articles (y-axis) per monitoring strategy for each type of remote sensing data.

Only one article focused on predicting changes in wetlands; it applied a combination of in situ measurements and satellite data to monitor plant phenology. However, weather conditions such as fog made it impossible to use historical data to predict seasonality ^[10][33]. Additionally, climate change, with its variation in weather and seasonality, makes it more difficult to use historical data to predict future plant phenology, as the future climate is unknown and unpredictable in the long term. Approximately 12% of the articles primarily analysed time series data derived from satellite products. Three of them indicated problems with the number of images available ^[13][36], their spatial resolution ^[7][30], or the high level of moisture affecting vegetation index performance ^[14][37], but the average accuracy surpassed 90%. From this point on, accuracy refers to the values provided by the authors in each publication, where accuracy is calculated as the percentage of hits relative to an already published result, such as Corine land cover, or to ground control points, grouped and averaged. The low number of images and low spatial resolution of satellite products were also a problem when combining satellite and airborne data ^[11][34]. Differences among wetland sites were identified as a further source of uncertainty when multiple watersheds or wetlands are studied together and treated equally; in such cases, adding in situ measurements is recommended to avoid errors ^[5][28]. However, combining in situ data with low-spatial-resolution satellite products has been reported to suffer from pixels being too large ^[13][36], indicating that the combination with in situ data may still require high-resolution imagery, either airborne or satellite. Furthermore, accuracy was not reported in the studies that combined airborne with satellite data and performed time series analysis ^[2][4][15][23,25,27]; hence, this systematic literature review (SLR) could not evaluate it.
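Accuracy in the sense used above, i.e., the share of samples that agree with a reference dataset, can be illustrated with a short sketch. The function name and the class labels at the ten reference points are hypothetical and only serve to show the arithmetic; individual studies may compute their reported accuracy differently.

```python
def overall_accuracy(predicted, reference):
    """Return the percentage of samples whose predicted class matches the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * hits / len(reference)


# Hypothetical classes at ten ground control points (illustrative values only).
reference = ["bog", "bog", "fen", "fen", "marsh", "marsh", "swamp", "bog", "fen", "marsh"]
predicted = ["bog", "fen", "fen", "fen", "marsh", "marsh", "swamp", "bog", "bog", "marsh"]

print(f"Overall accuracy: {overall_accuracy(predicted, reference):.1f}%")
```

Note that the result depends entirely on the reference used: validating against another remote sensing product rather than ground-based data changes what the percentage actually measures.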

In wetland mapping, satellite data could not provide enough information to differentiate among wetland classes, as occurred for bogs and fens ^[21][42] and for wetland types ^[9][23][32,44]. Another problem derived from satellite spatial resolution is the inability to distinguish water bodies, which are often masked by highly dense vegetation ^[15][22][23,43], or to differentiate small fires: when burn severity is studied, several small fires are clustered and accounted for as one large fire instead of being treated individually ^[18][39]. Five out of seven studies using satellite data for wetland mapping specifically reported problems with the spatial resolution of the imagery. The only article combining satellite data with in situ measurements reported a problem with skewed data: a non-representative set of ground-based values led, for example, to overestimation of high soil organic carbon values and underestimation of low ones ^[20][41]. With UAV data, the spatial resolution increased; however, when combined with satellite data, the results were limited by the difficulty of obtaining noise-free images to monitor phenology ^[16][24]. When combined with airborne data, multi-source satellite data were still needed for mapping due to the structural heterogeneity of some wetlands, such as peatlands ^[19][40].

Additionally, the low availability of airborne images per year meant that rapid changes, such as inundation status, could not be captured and mapped ^[8][31]. Using higher-resolution satellite data to validate results ^[17][38] can overestimate the actual accuracy (93.2% accuracy estimated), as accuracy should be obtained from comparison with ground-based data and not with another remote sensing dataset. High-resolution data from UAVs or airborne missions have time, scale, and sometimes even price limitations, so their use also faces challenges ^[42][61]. Satellite data used for change detection analysis on wetlands faced the same problems with the heterogeneity and density of wetland vegetation as in mapping, although the average accuracy of the results is higher (Table 2). However, this result can be biased by the difference in the number of articles for each monitoring type (Figure 1). Differences in spatial distribution and spectral similarities among wetland types also caused challenges in change detection studies, as it was impossible to select a specific shape and set of spectral indices that perfectly extract the changes over time ^[43][45][47][62,64,66]. The same can be observed when combining airborne and satellite datasets ^[6][44][29,63] and satellite and in situ data ^[46][65]. In contrast to wetland mapping, the main challenge in change detection studies was the rapid change caused by extreme events and the strong seasonality of wetlands, with satellites not providing a high enough temporal resolution to monitor them ^[37][42][48][56,61,67]; the same occurred when combining satellite with airborne data ^[40][59]. Harmonic time series analysis can be used as a solution ^[49][68], increasing the accuracy from the average of 89% to 93.35%, as can combining satellite data with UAV imagery, which reached 92% accuracy ^[50][69].
It is worth noting that using higher-resolution satellite images to validate results can lead to an overestimation of accuracy. This is exemplified by a study detecting inundation, land cover disturbances, and their associated changes, which achieved a validation accuracy of 91.1% using imagery from a private satellite rather than in situ data ^[38][57].
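Harmonic time series analysis, mentioned above as a way to handle strong seasonality, describes a seasonal signal with sine and cosine terms. The sketch below is not the method of any cited study; it is a minimal illustration that assumes a single annual harmonic and evenly spaced observations covering a whole number of years, in which case the least-squares coefficients reduce to simple sums. The function name and the synthetic vegetation index series are hypothetical.

```python
import math


def fit_first_harmonic(values, samples_per_year):
    """Fit y = mean + a*cos(w) + b*sin(w), where w is the annual phase angle.

    Assumes evenly spaced samples (at least three per year) covering a whole
    number of years, so the least-squares solution reduces to the sums below.
    Returns the mean, the seasonal amplitude, and the sample index of the peak.
    """
    n = len(values)
    if samples_per_year < 3 or n == 0 or n % samples_per_year != 0:
        raise ValueError("need >= 3 samples per year and a whole number of years")
    angles = [2 * math.pi * i / samples_per_year for i in range(n)]
    mean = sum(values) / n
    a = 2 / n * sum(v * math.cos(w) for v, w in zip(values, angles))
    b = 2 / n * sum(v * math.sin(w) for v, w in zip(values, angles))
    amplitude = math.hypot(a, b)
    # Convert the phase angle into a position within the year (in samples).
    phase = math.atan2(b, a) % (2 * math.pi)
    peak_sample = phase / (2 * math.pi) * samples_per_year
    return mean, amplitude, peak_sample


# Synthetic monthly vegetation index over two years, peaking at sample 5.
series = [0.5 + 0.3 * math.cos(2 * math.pi * (i - 5) / 12) for i in range(24)]
mean, amplitude, peak_sample = fit_first_harmonic(series, 12)
print(f"mean={mean:.2f}, amplitude={amplitude:.2f}, peak at sample {peak_sample:.1f}")
```

With irregular acquisition dates or gaps caused by clouds, which are common in satellite archives, the closed-form sums no longer apply and a general least-squares fit would be needed instead.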

Table 2. Average accuracy per monitoring strategy for each type of remote sensing data. Note: airborne data were excluded, as only one article used this data type, and the accuracy was not reported.

Monitoring Strategy	Satellite	Satellite + Airborne	Satellite + In Situ	Satellite + UAV	Total
Prediction	NA *	NA *	67%	NA *	67%
Time series analysis	94%	NA *	85%	NA *	91%
Mapping	82%	94%	83%	94%	86%
Classification	84%	NA *	97%	NA *	85%
Change detection	89%	86%	89%	92%	89%
Average total	85%	91%	87%	93%	86%

* NA = not available.

Wetland classification was the second most used monitoring approach, with 15 articles, surpassed only by change detection. While not all articles in the other categories provide accuracy estimates, owing to insufficient ground-truth measurements, all the classification analyses carried out in these studies include them. The spectral similarities among wetland types or types of vegetation were again a source of errors (accuracies: 86–96%) ^[4][27][27,46]. However, the heterogeneity and complexity of the wetland ecosystem represented a more significant source of errors (with lower accuracies: 77–88%) ^[24][25][9,17], including water bodies masked by dense vegetation ^[34][53]. The use of harmonic models can reduce the effect of these errors, with accuracies of up to 91% ^[30][49]. Some of the high accuracies reported, however, stem from validating results against other satellite-based classifications instead of in situ data ^[28][47]. A lack of ground-truth data can affect the validation of a study, decreasing the overall accuracy (72%) ^[35][54]. Similarly, training data are needed when applying deep learning techniques; if they are insufficient, the resulting classification shows lower accuracy (68.7–77.1%) ^[1][26][22,45]. Classifying wetlands with a satellite topography data approach lowered the accuracy of a land cover classification (81.85%) due to the heterogeneity and complexity of wetlands ^[33][52]. However, merging synthetic aperture radar (SAR) and optical remote sensing data adds some benefits (82.7% accuracy), as the number of variables available for classification increases ^[31][50].
Adding multiple satellite sources (>4), including high-resolution private ones, especially in combination with ground-truth data, allows wetland classifications at the high level of detail needed (e.g., for studying floristic composition). These analyses perform with very high accuracy (96%), and increasing the number of in situ variables is suggested to reach even higher accuracy ^[32][51]. Combining satellite images with ground-truth (in situ) data can reach up to 97% accuracy in classifying vegetation types on wetlands, despite some confusion between wetland and non-wetland vegetation ^[29][48].

The same sources of uncertainty are usually present regardless of the type of monitoring used. The only exception is temporal-coverage-related uncertainty, which affects time series analysis, prediction, and the detection of wetland changes. In most cases, overall accuracy has increased over time owing to improvements in the quality of satellite data. For instance, in a study of land cover changes in a Chinese swamp, estimation accuracies improved from 82% in 1984 to 92% in 2018 ^[41][60]. However, no remote sensing data source performs perfectly by itself. Additionally, applying robust algorithms over large areas requires substantial computational resources ^[36][55]. Consequently, to better assess uncertainties and improve monitoring performance, a combination of automatic in situ meteorological stations with satellite and airborne/UAV data, validated with enough reliable in situ measurements and processed using cloud computing services such as GEE, is recommended.

3. Which Monitoring Strategies Were Performed Using Cloud Computing Technology?

Depending on the type of results sought, research can be classified according to two approaches: holistic and atomistic. The scale at which the area is analysed determines whether a generalized (holistic) or a specific (atomistic) perspective is applied. Either the level of detail within an area or the extent of the area covered is usually prioritized. Thus, the type of wetland studied varies with the scale. Smaller scales usually focus on a particular type of wetland (e.g., marsh, bog, fen), as the limited area includes only that specific environment. As the scale increases, the probability of finding only one type of wetland inside the area decreases, and the focus of the study tends to switch from specific vegetation identification to wetland delineation (Figure 2). From the articles reviewed, these two main approaches can be distinguished:

Larger areas with a regional or national scale, including more than one type of wetland;
Smaller areas focused on a specific protected area with no more than two types of wetlands.

Figure 2. Percentage of papers depending on the scale used and type of wetland plus their distribution based on the relationship between these two parameters.

The type of results varies with ecosystem characteristics: the type of vegetation ^[2][30][37][25,49,56], water surface changes ^[39][48][58,67], and even burn severity ^[18][39] are most often limited to local-scale studies, whereas wetland delineation ^[44][45][63,64], distinguishing types of wetland ^{[6][21][24][26][36]}[9,29,42,45,55], land use ^[46][65], and carbon content in the peat ^[20][41] are usually addressed in larger-scale studies. Cloud computing has performed suitably in each of the distinct scale–wetland combinations, although the challenges of using remote sensing and cloud computing vary. At the regional scale, the main challenges reported were the spectral similarities between wetland types and their spatial heterogeneity. The spectral similarities produced higher confusion among wetland types, with an accuracy of 86% for bogs, reduced to 80% for fens ^[45][64], and 80% in saline marshes studied in China ^[46][65]. In a study in the Great Lakes, these spectral similarities induced confusion between wetlands and uplands ^[4][27]. In small and highly vegetated wetlands such as potholes, vegetation masked water bodies ^[22][43]. Spectral similarities also prevented the differentiation of herbaceous vegetation ^[25][17]. The spatial heterogeneity of peatland constrained the results, preventing the proper distinction of bogs and fens in Canada and reducing the accuracy to 69% ^[21][42]. Due to spectral similarities and spatial heterogeneity, the accuracy was reduced to 77% in a Canadian wetland inventory map ^[24][9].

Furthermore, differences in shape and distribution made it complex to monitor alpine wetlands and swamps in the same study ^[43][62]. The same occurred with bogs, fens, swamps, and marshes on the island of Newfoundland ^[6][29], although bogs showed the highest producer accuracies, between 92% and 97%, and fens had the highest user accuracies, between 66% and 86%. At the national scale, the challenges were the lack of adequate data with high resolution ^[26][45], low noise ^[16][24], or both ^[7][30], and the high computational resources required for such large datasets ^[36][55]. The local-scale studies also faced challenges due to spectral and spatial features. The use of NDVI in wetlands can be compromised by high moisture content, making it difficult to obtain reliable results ^[14][37]. Problems related to the moisture or water table level ^[33][52], or to strong seasonality and fast changes ^[12][48][35,67], were commonly faced at this scale. The high heterogeneity of wetlands and their spectral similarities, together with the moisture level, make multi-source approaches almost mandatory at local scales ^[19][40]. Apart from this, the same spectral similarities that prevent fine distinction among wetland types at a regional scale can make algorithms hard to reproduce when they are based on a particular type. Moreover, although high accuracy (96.44%) was achieved at a specific local scale, it may well be restricted to a particular type of wetland ^[27][46]. Confusion also arises from the presence of several uplands and the small areas covered by peatland, which makes monitoring difficult at the local scale ^[23][47][44,66].
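The producer and user accuracies quoted above are per-class measures derived from a confusion matrix: producer's accuracy is the share of reference samples of a class that were mapped correctly, and user's accuracy is the share of samples mapped as a class that truly belong to it. A minimal sketch follows; the function name, the row/column convention, and the confusion matrix values are hypothetical and are not taken from any cited study.

```python
def producer_user_accuracy(confusion, classes):
    """Compute per-class producer's and user's accuracy in percent.

    Convention assumed here: confusion[i][j] is the number of samples whose
    reference class is classes[i] and whose mapped class is classes[j].
    """
    result = {}
    for k, name in enumerate(classes):
        correct = confusion[k][k]
        reference_total = sum(confusion[k])              # row sum
        mapped_total = sum(row[k] for row in confusion)  # column sum
        producer = 100.0 * correct / reference_total if reference_total else float("nan")
        user = 100.0 * correct / mapped_total if mapped_total else float("nan")
        result[name] = (producer, user)
    return result


# Hypothetical confusion matrix for three wetland classes (illustrative only).
classes = ["bog", "fen", "marsh"]
confusion = [
    [45, 4, 1],    # reference bog
    [10, 30, 10],  # reference fen
    [0, 6, 44],    # reference marsh
]

accuracies = producer_user_accuracy(confusion, classes)
for name, (producer, user) in accuracies.items():
    print(f"{name}: producer's = {producer:.1f}%, user's = {user:.1f}%")
```

In this made-up example the fen class has the lowest producer's accuracy because many reference fen samples are mapped as bog or marsh, which is the kind of confusion between spectrally similar classes described in the text.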

To ensure good performance of CC models, the adequacy of the CC methodology across scales and its accuracy need to be evaluated. The reduced complexity of the analyses and the lower number of studies at the national scale resulted in the lowest accuracy among all the scales (Table 3). The regional and local scales showed nearly identical average accuracies, but the local scale showed the widest standard deviation (Table 3). Different monitoring techniques and distinct types of wetlands can be found within each scale, and the highest deviation usually appears in the groups with the largest number of papers. This is why the smallest deviation occurs at the national scale, where only four papers were found. On average, studies of individual (no more than two types) and mixed types of peatland show similar accuracies, 86.0% ± 41.0% and 86.9% ± 22.0%, respectively, but with a different number of papers analysed in each category (33 articles focused on individual types of wetlands and 18 on mixed types). As previously noted, the scale of wetland studies is often determined by the type of wetland under investigation, whether individual or mixed.

Table 3. Accuracy distribution by study scale; the average is given as mean ± standard deviation.

	National	Regional	Local
Average	81.8 ± 9.6%	86.9 ± 30.3%	87 ± 42.5%
Max	94%	98.2%	98%
Min	71%	69%	67%
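Entries of the form "average ± standard deviation", together with the maximum and minimum, can be computed for any group of reported accuracies as sketched below. This is an illustration only: the accuracy values are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not state which estimator was applied.

```python
import statistics

# Hypothetical accuracies (in percent) reported by three studies at one scale.
reported = [72.0, 80.0, 88.0]

average = statistics.mean(reported)
# Assumption: sample standard deviation (n - 1 in the denominator).
deviation = statistics.stdev(reported)

print(f"Average: {average:.1f} ± {deviation:.1f}%")
print(f"Max: {max(reported):.0f}%")
print(f"Min: {min(reported):.0f}%")
```

With only a handful of studies in a group, both the average and the deviation are sensitive to a single value, which is one reason comparisons between groups of very different sizes should be read with caution.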

Consequently, authors choose their study targets based on the specific wetland type, and their selection depends on the most suitable location for their research objectives. For this reason, accuracies are similar regardless of the number of wetland types studied. However, an increase in accuracy would be expected when only one type of wetland is studied, along with different performances depending on the type of wetland (resulting in a higher standard deviation).

Random forest (RF) has been the most widely used machine learning (ML) method, with 19 articles applying it ^{[4][5][9][16][19][23][24][25][26][28][29][30][31][32][36][37][42][45][46]}[9,17,24,27,28,32,40,44,45,47,48,49,50,51,55,56,61,64,65] (Figure 3). Ordered from the most to the least frequent, the other ML techniques used were classification/regression trees (accuracy = 79.2%, ^{[12][21][33][41]}[35,42,52,60]), clustering (accuracy = 98%, ^[8][47][48][31,66,67]), support vector machines (SVM, accuracy = 93.4%, ^[7][44][30,63]), and artificial neural networks (ANN, accuracy = 96.4%, ^[27][46]). However, when RF was not the only method used, the authors preferred a mix of multiple ML techniques, with lower accuracies than RF when reported as the average of all methods applied (accuracy = 85.9%, ^{[1][2][6][20][22][35][49][50]}[22,25,29,41,43,54,68,69]). On the other hand, not all the authors considered ML necessary: index thresholding ^[12][14][19][35,37,40], object-based image segmentation ^[17][38], trend analysis ^[14][40][37,59], and regressions ^{[10][34][38][43]}[33,53,57,62] have been successfully used for peatland monitoring (accuracy = 88.8%, ^[3][15][39][23,26,58]). Because not all papers reported accuracies, and the number of papers per class is not comparable, the relative success of each technique cannot be assessed. For example, the non-ML group presents almost the same average accuracy as RF, but only six of the thirteen articles in this group reported this value, whereas all the authors using RF provided accuracy. ANN and SVM methods present some of the highest accuracies; this is not surprising considering that only three articles used these techniques ^[7][27][44][30,46,63].

Figure 3. Distribution of the method used and average accuracy. The long-dash line indicates the accuracy scale from 0% to 100%; the dot marks the average accuracy (ANN = 96.4%, SVM = 93.4%, Clustering = 98%, Classification/Regression Tree = 79.2%, Mixed ML = 85.1%, NO ML = 88.8%, and RF = 85.9%).
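Index thresholding, one of the non-ML techniques listed above, can be illustrated with the normalized difference vegetation index, NDVI = (NIR − Red) / (NIR + Red). The sketch below is a minimal illustration rather than the procedure of any cited study: the reflectance values, the function names, and the threshold of 0.3 are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    # Guard against division by zero for pixels with no signal.
    return (nir - red) / total if total else 0.0


def threshold_mask(nir_band, red_band, threshold):
    """Return a boolean grid that is True where NDVI exceeds the threshold."""
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical surface reflectances for a 2 x 3 pixel scene.
nir_band = [[0.40, 0.30, 0.05], [0.35, 0.10, 0.04]]
red_band = [[0.10, 0.20, 0.06], [0.05, 0.08, 0.05]]

# Hypothetical threshold; real studies calibrate it per site and sensor.
mask = threshold_mask(nir_band, red_band, threshold=0.3)
for row in mask:
    print(row)
```

As the text notes, high moisture content can compromise NDVI in wetlands, so a fixed threshold like the one assumed here would need to be checked against ground data before use.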

4. What Economic Gains Can Be Realized from Integrating Cloud Computing and Remote Sensing Data in the Monitoring of Wetlands?

Cloud computing provides benefits at two levels. The first is scale: user organizations save money because they purchase cloud computing resources in massive quantities at lower costs and can thus provide services to end users at a lower cost. The second is global reach, which also increases for companies and organizations that use cloud computing. As a result, end users can avoid the substantial up-front capital expenditure of purchasing their own expensive infrastructure. As in other fields, scientists, companies, and organizations working with wetlands also benefit economically from implementing cloud computing and remote sensing data (Table 4).

Table 4. Highlighted benefits of cloud computing and remote sensing and the economic factors they influence in wetland monitoring.

Factors	RS Without CC	Benefits Due to CC + RS
Resolution	Differs	Differs
Coverage	Varies	High
Capital expenses	High	Lower
Cost	High	Lower
Time	Long	Shorter
Human resources	High	Lower
Global reach	Limited	High

Wetlands are ecosystems with very high productivity; thus, they are considered among the most economically valuable ecosystems for society ^[51][70]. They offer a diverse array of ecosystem services and are regarded as vulnerable systems that respond rapidly to alterations in the surrounding environment ^[52][71]. Unfortunately, in recent decades, wetlands have been lost worldwide ^[53][72], which affects the economic value of the services they provide. Economic valuations of wetland services may give organizations and governments a better understanding of the loss, but because of wetland location, cost, and time, a field survey is generally not a viable option, especially for poor or developing countries. Owing to their significantly lower cost and time requirements and their ability to monitor large areas of wetlands and their resources, cloud computing and remote sensing data may play a significant role in economic decision making by policymakers and stakeholders. Using GEE, researchers reported a significant loss in semi-arid southern African wetlands due to unsustainable use and poor management ^[54][73], pushing the authorities to act differently to preserve their resources. The authors also showed how cloud computing platforms can offer unique and significant data handling and processing opportunities to scientists or workers with limited resources. Thus, economically favourable policies can be created for a wetland ecosystem using cloud computing and remote sensing.