2. Importance of Wetland Water-Level Monitoring
Wetlands usually occur in low-energy environments where water flows slowly, because the land surface in these areas is relatively level
[21][47]. Because the terrain is relatively level, the surface area of a wetland expands and contracts as the water level changes, allowing a large quantity of water to be stored
[15]. Fluctuations in wetland water levels are important because they improve the productivity and biodiversity of wetland areas
[22][48]. Water level, hydropattern, and residence time are the three key elements that can be used to characterize the hydrologic behavior of wetlands
[15]. Subtle changes in water levels can have a significant impact on vegetation patterns, characteristics, and ecological processes in wetland habitats. Therefore, the associated vegetation cover can serve as an indicator of water levels during drought, flooding, and normal conditions
[23][49].
Wetlands are responsible for 20–25% of methane emissions into the atmosphere; however, they also absorb a significant amount of carbon dioxide. Wetland water levels play a vital role in controlling methane emissions by acting as the interface between aerobic and anaerobic processes and by determining the degree of carbon dioxide production
[24][25][50,51]. In addition, wetland water levels reflect the dissolved oxygen conditions in the wetland’s soil–water system. The higher the wetland water level, the lower the dissolved oxygen concentration in the soil
[15]. Anaerobic conditions develop more quickly in saturated soils than in unsaturated soils, because the solubility of oxygen in water is low. The amount and type of sediment–water nutrient exchange are affected by the frequency, duration, and magnitude of water-level fluctuations
[26][52]. Therefore, the availability of water affects soil oxygen concentrations, which in turn can adversely affect plant growth.
In addition, water-level fluctuations directly affect plant and animal communities
[27][53]. A case study by Wilcox and Nichols
[28][54] in a Lake Huron wetland found that water-level fluctuations affect the biodiversity and territory value of wetland plant communities. Water levels are therefore crucial to the survival of wetlands and to maintaining the ecological balance of their flora and fauna, and wetland species have preferred water depths. Furthermore, some wetlands are situated along river basins and function as flood-detention basins. These ecosystems play a major role in managing flash floods caused by extreme weather conditions. As such, water levels must be monitored and predicted so that downstream flow depths can be calculated and natural disasters such as floods prevented
[29][55]. Therefore, water-level measurement and forecasting are significant for wetland conservation and management
[15][30][15,56]. Wetland water-level fluctuations have been observed to depend on the seasonal and annual variation of climatic conditions. Therefore, evaluating water levels can also support forecasting of time-varying climatic conditions
[31][57]. For this purpose, models can be used to simulate and forecast wetland water levels whenever decision-making relevant to wetlands, or other weather forecasting, requires it
[32][36].
3. Available Machine-Learning Techniques to Predict Wetland Water Levels
Wetland water levels can be predicted in several ways, including physically based and data-driven approaches
[33][58]. Physically based approaches tend to be complex, are time-consuming to develop, and require a high level of knowledge in the relevant field
[16]. Hydrologic and hydraulic models such as the Hydrologic Engineering Center’s River Analysis System (HEC-RAS), the Soil & Water Assessment Tool (SWAT), and MIKE can be used to simulate water levels
[34][59]. Nevertheless, the major drawback of these models is that they require a sound understanding of hydrological processes and a variety of data, including inflows and outflows, bathymetry, and meteorological data
[35][60]. Moreover, model development and calibration are more challenging when limited data are available
[36][61]. However, machine-learning techniques can overcome most of these difficulties in predicting water levels in wetlands
[37][62].
The data-driven machine-learning approach is very effective because it can be applied to many nonlinear problems such as water-level forecasting, sediment transport, water-quality prediction, and groundwater modeling
[38][63]. Change in the water level is a complex hydrological phenomenon, as there are many controlling factors
[39][64]. Decision-making is challenging in such cases, and traditional prediction techniques cannot achieve the desired research purposes when large-scale data are unavailable
[40][65]. Machine-learning techniques, by contrast, offer many advantages, including simplicity of implementation, rapid running speed and convergence, and strong adaptability
[41][66]. Machine learning is therefore one of the ideal tools for most complex situations
[16].
Artificial neural networks (ANN), kernel methods, radial basis functions (RBF), and support vector machines (SVM) are the machine-learning techniques most commonly used in water-level prediction
[16][42][43][16,67,68]. However, hydrological predictions using computer-based models can produce uncertainties and the results can differ from model to model
[44][69]. Therefore, selecting a suitable machine-learning technique is a challenging task, because different techniques serve different purposes. Typically, data availability is the key consideration when constructing a learning algorithm for wetland water-level prediction
[45][70].
Artificial neural network (ANN) models are very effective for hydrologic systems, as they can build up relationships from the given data
[46][71]. McCulloch and Pitts
[47][72] are considered the pioneers of the concept of the artificial neural network
[48][73]. Their model imitates the function of the human brain, in which many neurons are interconnected
[49][74]. These neurons are organized into two or more layers joined by weighted connections
[50][75].
Figure 1 shows a simple architecture of an artificial neural network for wetland water-level prediction. It consists of three layers: an input layer, a hidden layer, and an output layer. The network is first trained using known hydrological parameters and the corresponding known water levels. The trained network can then be used to predict unknown wetland water levels from known hydrological parameters. The number of hidden layers may be increased depending on the problem.
Figure 1.
Layers in artificial neural networks.
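The training and prediction steps described for Figure 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in any cited study: the two inputs (scaled rainfall and evaporation), the synthetic target, the tanh hidden layer, the plain stochastic gradient descent update, and all function names are hypothetical choices made for the example.

```python
import math
import random

def forward(x, w_h, b_h, w_o, b_o):
    """Input layer -> tanh hidden layer -> linear output (predicted level)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    return h, sum(w * hj for w, hj in zip(w_o, h)) + b_o

def train_ann(xs, ys, n_hidden=4, lr=0.1, epochs=2000, seed=0):
    """Fit the weighted connections by stochastic gradient descent."""
    rng = random.Random(seed)
    n_in = len(xs[0])
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            h, y_hat = forward(x, w_h, b_h, w_o, b_o)
            err = y_hat - y                                # d(0.5 * err**2) / d(y_hat)
            for j in range(n_hidden):
                grad_h = err * w_o[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                w_o[j] -= lr * err * h[j]
                b_h[j] -= lr * grad_h
                for i in range(n_in):
                    w_h[j][i] -= lr * grad_h * x[i]
            b_o -= lr * err
    return lambda x: forward(x, w_h, b_h, w_o, b_o)[1]

# Synthetic, made-up data (assumption): inputs are scaled rainfall and
# evaporation; the target is a hypothetical water level, not a real record.
data_rng = random.Random(1)
inputs = [[data_rng.random(), data_rng.random()] for _ in range(40)]
levels = [0.5 + 0.3 * r - 0.2 * e for r, e in inputs]

predict = train_ann(inputs, levels)   # training on known parameters and levels
mse = sum((predict(x) - y) ** 2 for x, y in zip(inputs, levels)) / len(levels)
print(f"training MSE: {mse:.6f}")
print(f"predicted level for [0.5, 0.5]: {predict([0.5, 0.5]):.3f}")
```

In practice a library implementation with a held-out validation set would be preferable; the sketch only makes the input, hidden, and output layers and the train-then-predict workflow concrete.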
The model seeks a solution to a time-dependent mathematical function (refer to Equation (2)), where Y is the dependent variable, which varies with time, and X_i are the independent variables in the time domain. The ANN formulates the nonlinear relationship expressed in Equation (2). Many optimization algorithms, including the Levenberg–Marquardt (LM) algorithm, the scaled conjugate gradient (SCG) algorithm, and the Bayesian regularization (BR) algorithm, are used to enhance the performance of the developed ANN model
[51][52][76,77].
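Equation (2) itself is not reproduced in this text. One generic form consistent with the definitions of the dependent and independent variables is sketched here; the function symbol f, the number of inputs n, and the explicit time argument t are assumed notation rather than the original formulation.

```latex
% Assumed generic form: the water level Y at time t as a nonlinear
% function f of n time-dependent hydrological inputs X_i.
Y(t) = f\bigl(X_1(t),\, X_2(t),\, \ldots,\, X_n(t)\bigr)
```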
Support vector machine (SVM) is another popular machine-learning technique that can be used to predict water levels; it was developed from statistical learning theory
[41][66]. The SVM identifies hyperplanes that linearly separate the vectors of different classes with the maximum margin between them
[16]. SVMs operate with the assistance of kernels. Whereas the accuracy of a neural network depends on the number of nodes in the hidden layer, the accuracy of an SVM depends on the kernel mapping. Polynomial, sigmoid, and radial basis functions can be used as kernels
[16]. Nevertheless, the radial basis function (RBF) can be considered the best kernel function for water-level predictions, as it gives a globally optimal solution while avoiding overfitting
[53][78]. Equations (3) and (4) present the mathematical formulation of SVM in generic form. The regression function used in SVM can be formulated as Equation (3).
where
φ(X) is a nonlinear function that maps the input vector to a high-dimensional space,
ω is the weight vector, and b is the bias. The regression function is estimated by minimizing the structural risk function given in Equation (4),
where
N is the sample size, C controls the tradeoff between model complexity and empirical error, and
Lε is Vapnik’s
ε-insensitive loss function.
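Equations (3) and (4) are not reproduced in this text. The standard textbook form of support vector regression, which matches the symbols defined above, is given below as a sketch. The target values y_i, the risk symbol R, and the exact placement of the 1/N factor are assumptions and may differ from the original equations.

```latex
% Equation (3), assumed standard form: regression function
f(X) = \omega \cdot \phi(X) + b

% Equation (4), assumed standard form: regularized (structural) risk
R = \frac{1}{2}\,\lVert \omega \rVert^{2}
    + C\,\frac{1}{N} \sum_{i=1}^{N} L_{\varepsilon}\bigl(y_i,\, f(X_i)\bigr)

% Vapnik's epsilon-insensitive loss: errors smaller than epsilon cost nothing
L_{\varepsilon}\bigl(y,\, f(X)\bigr) = \max\bigl(0,\; \lvert y - f(X) \rvert - \varepsilon\bigr)
```

The first term of the risk penalizes model complexity and the second term penalizes empirical error, which is why C acts as the tradeoff between the two.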
Random forests are another machine-learning approach and consist of a collection of m tree predictors. The trees are produced by randomly selecting variables from separate categories
[54][79]. Nevertheless, when the model contains a very large number of trees, overfitting can become an issue. This can be overcome by selecting the number of trees that gives the lowest mean square error
[16]. Random forests can handle not only nonlinear data but also non-Gaussian data. Additionally, the relative importance of each variable can be measured, which supports variable selection
[55][80]. Random-forest models are also less sensitive to outliers and noise, are faster than bagging or boosting, provide useful internal estimates of error, strength, correlation, and variable importance, and are simple and easily parallelized
[54][79]. The method has also been applied in many water-related studies, including wetland water-level prediction
[56][81]. The schematic diagram of a generic random-forest approach is given in Figure 2. There can be n trees for decision-making; the decisions of all trees are combined to estimate the final result.
Figure 2.
Schematic diagram of a generic Random-Forest approach.
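The scheme in Figure 2 can be illustrated with a small, self-contained Python sketch: each tree is grown on a bootstrap sample, each split uses one randomly chosen variable, and the forest prediction is the average over all trees. The data are synthetic, and the function names, tree depth, and candidate tree counts are hypothetical choices for illustration only, not the configuration of any cited study.

```python
import random

def avg(values):
    return sum(values) / len(values)

def build_tree(rows, ys, rng, depth=0, max_depth=4, min_size=4):
    """Grow a regression tree; each split uses one randomly chosen variable."""
    if depth >= max_depth or len(rows) < min_size:
        return avg(ys)                                 # leaf: mean target value
    f = rng.randrange(len(rows[0]))                    # random variable selection
    best_t, best_sse = None, float("inf")
    for t in sorted({r[f] for r in rows})[1:]:         # both sides stay non-empty
        left = [y for r, y in zip(rows, ys) if r[f] < t]
        right = [y for r, y in zip(rows, ys) if r[f] >= t]
        ml, mr = avg(left), avg(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best_sse:
            best_t, best_sse = t, sse
    if best_t is None:                                 # variable is constant here
        return avg(ys)
    lo = [(r, y) for r, y in zip(rows, ys) if r[f] < best_t]
    hi = [(r, y) for r, y in zip(rows, ys) if r[f] >= best_t]

    def grow(part):
        return build_tree([r for r, _ in part], [y for _, y in part],
                          rng, depth + 1, max_depth, min_size)

    return (f, best_t, grow(lo), grow(hi))

def predict_tree(node, x):
    while isinstance(node, tuple):                     # descend until a leaf
        f, t, left, right = node
        node = left if x[f] < t else right
    return node

def build_forest(rows, ys, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx], [ys[i] for i in idx], rng))
    return forest

def predict_forest(forest, x):
    return avg([predict_tree(tree, x) for tree in forest])  # combine all trees

def mean_square_error(forest, rows, ys):
    return avg([(predict_forest(forest, x) - y) ** 2 for x, y in zip(rows, ys)])

# Synthetic, made-up data (assumption): scaled rainfall and evaporation as
# inputs and a hypothetical noisy water level as the target.
data_rng = random.Random(1)
xs = [[data_rng.random(), data_rng.random()] for _ in range(120)]
ys = [0.5 + 0.3 * r - 0.2 * e + data_rng.gauss(0, 0.01) for r, e in xs]
train_x, train_y, test_x, test_y = xs[:80], ys[:80], xs[80:], ys[80:]

# Pick the number of trees with the lowest held-out mean square error.
scores = {n: mean_square_error(build_forest(train_x, train_y, n_trees=n), test_x, test_y)
          for n in (1, 5, 25)}
best_n = min(scores, key=scores.get)
mse_baseline = avg([(avg(train_y) - y) ** 2 for y in test_y])  # always predict the mean
print(scores, best_n, mse_baseline)
```

The final sweep mirrors the selection rule described above, where the number of trees is chosen by the lowest mean square error; evaluating on held-out data rather than on the training set is a design choice of this sketch.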