Hybrid Artificial Intelligence in Groundwater


Subjects: Water Resources

As one of the world’s most valuable and vital water sources, groundwater is integral to many facets of human life, including food production, economic growth, and safe drinking water. Developing precise soft computing methods for groundwater management, covering both quality and quantity, is crucial for improving water resources planning and management. Significant progress has been made in groundwater management using hybrid machine learning (ML) models as artificial intelligence (AI) tools.

hybrid machine learning
groundwater management
performance models

1. Introduction

Hybrid artificial intelligence (AI) models combine different AI techniques to enhance the accuracy and robustness of predictions for groundwater quality and quantity management [109]. Traditional modelling approaches have limitations in capturing complex nonlinear relationships between input and output variables, and hybrid AI models have emerged as promising solutions to overcome these limitations [110]. Hybrid AI models can improve the accuracy and reliability of groundwater forecasting by integrating multiple AI techniques, such as neural networks, fuzzy logic, and support vector machines, leading to more informed decision-making for groundwater quality and quantity management and ultimately contributing to sustainable use and protection of this vital resource. The use of hybrid artificial intelligence techniques in groundwater quality and quantity management has shown promise in improving prediction accuracy and optimising management strategies.

2. More Common Hybrid AI Models

2.1. Artificial Neural Networks (ANN) and Support Vector Machines (SVM)

The high citation number confirms that ANN and SVM are popular machine learning techniques that are widely used in hybrid models in groundwater sciences. One of the main advantages of ANNs and SVMs is their ability to model complex relationships between variables, which is often difficult to achieve using traditional analytical and numerical modelling approaches [111,112]. They can also handle large amounts of data and are relatively fast and efficient [113]. Another advantage of these techniques is their flexibility, as they can be used for a wide range of applications, from predicting groundwater levels and flow rates to identifying potential sources of contamination. However, ANNs and SVMs have their limitations. One of the main challenges of using these techniques is the need for large amounts of high-quality data to train the models, which can be costly and time-consuming to collect [114]. In addition, these models are often considered black boxes, meaning it is difficult to understand how they arrive at their predictions [115]. This lack of transparency can make it challenging to interpret the results and may limit their usefulness in decision-making processes. Another disadvantage of ANNs and SVMs is the potential for overfitting, which occurs when the model is too closely fitted to the training data and performs poorly on new data [68]. Overfitting can be addressed using appropriate techniques such as regularisation and cross-validation [116]. In conclusion, while ANNs and SVMs offer significant advantages for modelling groundwater systems, their limitations should be addressed to ensure accurate and reliable predictions.
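The cross-validation mentioned above can be sketched in plain Python. This is a minimal illustration rather than the procedure of any cited study; the helper name `k_fold_indices` and the sample count are assumptions introduced here.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Hypothetical helper for illustration only. Groundwater time series
    often call for blocked or forward-chaining splits instead.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Ten hypothetical observations split into three folds.
folds = list(k_fold_indices(10, 3))
```

Each fold is held out once while the model is trained on the remaining samples; averaging the held-out errors gives an estimate of how the model behaves on unseen data.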

Integrating ANN and SVM as a hybrid AI has shown promising results for the overall accuracy and performance of AI models in various hydrogeological applications. For instance, researchers used a hybrid AI model comprising ANN and SVM to build a unique ensemble model based on a high-resolution groundwater potentiality model [117]. ROC curves confirmed that the hybrid model outperformed the individual ANN and SVM models by around 10%. An ANN-SVM hybrid AI model was also used for groundwater level prediction in urban areas [118]; the authors reported that the hybrid model improved prediction accuracy by up to 62% compared to a single-model approach. However, the degree to which a hybrid AI model improves accuracy and speed may be context- and problem-specific.

2.2. Genetic Algorithm (GA) and Artificial Neural Networks (ANN)

GA and ANN are two popular artificial intelligence methods used in groundwater quality management [105,119]. ANN is a type of AI that mimics the structure and function of the human brain to process information. It consists of interconnected processing units that receive input data and output a prediction or decision [120]. ANN is particularly suitable for groundwater quality management because it can handle large amounts of complex data, including uncertain and imprecise data [116,121]. It can also adapt to changing conditions and learn from experience, making it a useful tool for predicting groundwater quality. GA, in turn, optimises the parameters of models used in groundwater quality management. The advantages of genetic algorithms include their ability to search a large parameter space efficiently and handle nonlinear relationships between variables. Used together as a hybrid AI, ANN and GA complement each other, providing a powerful tool for modelling and predicting groundwater quality [121,122]. However, the disadvantage of this approach is that it requires a considerable amount of data and computing power, which can be a challenge in some applications [121,123]. Additionally, the results of this approach may be difficult to interpret, making it challenging for decision-makers to understand the basis for their decisions. Nonetheless, the advantages of using ANN and GA as hybrid AI in groundwater quality management outweigh their disadvantages, making them an essential tool for improving the management of groundwater resources.

One instance of integrating GA and ANN as a hybrid AI is the study by Pandey et al. [124], which employed the GA-ANN hybrid approach for predicting seasonal groundwater table depth. The study reported a significant improvement of around 43% in R² compared to the individual models. Another study assessed the development of hybrid ANN models and their critical assessment for simulating groundwater levels at 17 sites in an alluvial aquifer system [125]. According to the findings of this study, the hybrid model was identified as the most efficient method for predicting spatiotemporal fluctuations of groundwater at almost all of the sites, with the Nash-Sutcliffe efficiency ranging from 0.828 to 0.998.

2.3. Wavelet Transform (WT) and Artificial Neural Networks (ANN)

Generally, Wavelet Transform (WT) and Artificial Neural Networks (ANN) have emerged as powerful groundwater forecasting and modelling tools. The WT-ANN model has been applied in various groundwater studies, including predicting groundwater levels [126,127], identifying trends and patterns in groundwater data [128], and modelling groundwater recharge [129]. One advantage of this hybrid model is its ability to handle nonlinear relationships between the input and output variables, which is common in groundwater systems. Moreover, wavelet analysis can help identify important frequency components in the data, improving the accuracy of the predictions [127]. Despite its advantages, the development of a WT-ANN model can be a challenging task. The model requires a large amount of data for training, and selecting appropriate wavelet basis functions can significantly impact its performance.

Additionally, interpreting the results can be challenging due to the black-box nature of the ANN component [115]. Therefore, it is essential to carefully design and optimise the model to achieve the best results. Overall, the WT-ANN hybrid AI model has shown promising results in groundwater applications and has the potential to improve our understanding of complex groundwater systems. However, it is crucial to investigate this model’s strengths and limitations and identify ways to optimise its performance in different hydrogeological settings. The use of hybrid AI models, such as the WT-ANN, can significantly advance the field of groundwater sciences and contribute to sustainable management and protection of this vital resource.

Many researchers have used WT and ANN as a hybrid AI for groundwater prediction and modelling. For example, to predict the groundwater levels of a dry inland river on multiple scales, Wen et al. [127] tested the efficacy of a wavelet analysis-artificial neural network (WA-ANN) conjunction model. They hypothesised that the WA-ANN model would be especially useful for predicting the intricate dynamics of groundwater level variations. A related study [130] assessed the performance of the hybrid WA-ANN approach in predicting the quality of shallow groundwater using the improved Nemerow pollution index. The evaluation was based on metrics such as MAE and R². The findings indicated that the WA-ANN hybrid method outperformed the individual methods, as demonstrated by the higher accuracy achieved with the hybrid approach.
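To make the wavelet step more concrete, the sketch below performs one level of a Haar wavelet decomposition, the kind of pre-processing that would typically feed the approximation and detail series to an ANN. The Haar basis is chosen here only for brevity and is an assumption; the cited studies may use other wavelet families, and the groundwater levels are made-up values.

```python
import math


def haar_step(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists. The input
    length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


# Hypothetical monthly groundwater levels (m); values are illustrative.
levels = [12.1, 12.3, 12.0, 11.6, 11.2, 11.5, 11.9, 12.2]
approx, detail = haar_step(levels)
rebuilt = haar_inverse(approx, detail)
```

The approximation coefficients carry the slow trend and the detail coefficients carry the short-term fluctuations, so each can be modelled separately before the forecasts are recombined.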

2.4. Adaptive Neuro-Fuzzy Inference System and Genetic Programming

The adaptive neuro-fuzzy inference system (ANFIS) and genetic programming (GP) are widely used hybrid AI techniques in groundwater quality management. The ANFIS method combines the strengths of fuzzy logic and neural networks, making it a powerful tool for modelling complex systems [38,131]. The GP method, on the other hand, is a search algorithm that uses natural selection and genetic operations to evolve a population of computer programs that can solve a particular problem [132].

One advantage of using the ANFIS-GP hybrid AI method in groundwater quality management is that it integrates different data types, including spatial and temporal data, which are essential for accurately predicting and managing groundwater quality [133]. Additionally, this method can handle missing data, which is common in groundwater quality management, and can tolerate noisy data [119,133]. Another advantage is that the ANFIS-GP hybrid AI method can accurately model complex systems, allowing for more efficient and effective decision-making in groundwater quality management.

However, there are also some disadvantages to using the ANFIS-GP hybrid AI method. One such drawback is that the technique requires significant data to build an accurate model, which can sometimes be challenging [133,134]. Additionally, the ANFIS-GP method can be computationally intensive, leading to longer processing times and increased costs. Finally, the ANFIS-GP method can be difficult to interpret and understand, which can be a significant challenge for stakeholders and decision-makers who need to use the modelling process results [134].

2.5. Support Vector Machines (SVM) and Random Forest (RF)

SVM and RF are well-known and influential machine-learning algorithms in various fields, including groundwater analysis. While they have their respective strengths and weaknesses, combining these two algorithms as a hybrid model can result in improved performance and more robust predictions [135]. The hybrid model that combines SVM and RF takes advantage of the strengths of both algorithms, resulting in a more accurate and stable model [136]. SVM can handle high-dimensional data and nonlinear relationships, while RF can identify important variables and handle missing values and noisy data [137,138]. Combining these algorithms allows the hybrid model to better manage complex groundwater systems with many variables and provide more reliable predictions. The hybrid model also addresses some disadvantages of SVM and RF, such as overfitting, sensitivity to hyperparameters, and difficulty in interpretation [116,139]. By using RF to select important variables and SVM to build a more accurate model with reduced dimensions, the hybrid model can provide better generalisation and performance on new data.

Additionally, the hybrid model can measure uncertainty and confidence in the model’s predictions. Thus, combining SVM and RF as a hybrid model can result in improved performance and more robust predictions for groundwater analysis. While the specific implementation of the hybrid model depends on the dataset and research question, researchers should consider the advantages of each algorithm and the potential benefits of combining them to develop a more accurate and reliable model.

2.6. Artificial Neural Networks (ANN) and Kriging

ANN and Kriging are two common methods used in groundwater analysis. ANN is a machine learning algorithm inspired by the structure and function of biological neural networks, which can predict groundwater levels, flow rates, or other hydrogeological parameters based on input data, such as precipitation, temperature, and soil properties. Kriging is a geostatistical method used for the spatial interpolation of data. It uses statistical models to estimate the values at unsampled points as weighted combinations of the values at nearby sampled points. In groundwater analysis, Kriging can be used to interpolate groundwater levels or flow rates from a limited number of monitoring wells to create a spatially continuous prediction [140,141].
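For reference, the interpolation described above can be written in the textbook ordinary kriging form. The notation is an assumption introduced here and does not come from the cited studies: Z(x_i) are the values observed at n monitoring wells, λ_i are the kriging weights, γ is the semivariogram, and μ is a Lagrange multiplier that enforces unbiasedness. The cited works may use other kriging variants.

```latex
\hat{Z}(x_0) = \sum_{i=1}^{n} \lambda_i \, Z(x_i),
\qquad \sum_{i=1}^{n} \lambda_i = 1,

\sum_{j=1}^{n} \lambda_j \, \gamma(x_i, x_j) + \mu = \gamma(x_i, x_0),
\qquad i = 1, \dots, n.
```

The first line gives the estimate at an unsampled location x_0 as a weighted sum of observations; the second line is the linear system that determines the weights from the fitted semivariogram.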

A hybrid model that combines ANN and Kriging can take advantage of the strengths of both algorithms, capturing the complex spatial relationships between groundwater parameters and improving the accuracy of the predictions by incorporating spatial correlation information, which results in a robust and stable solution to groundwater prediction problems [142,143]. Researchers can use this hybrid model to predict groundwater levels, flow rates, or other hydrogeological parameters based on input data and incorporate spatial correlation information to create a more accurate and reliable prediction [141,144,145].

For instance, a study by Hosseini et al. [146] integrated ANN and Kriging to model and increase the efficiency of groundwater-level monitoring networks. The results showed that the hybrid approach had a higher accuracy of up to 78% in predicting the spatial distribution of hydraulic heads than either ANN or Kriging alone. Another study by Moasheri et al. [147] used the ANN-Kriging hybrid model to predict groundwater quality parameters in the Kashan area. The study reported that the hybrid model provides more accurate results (by up to 11%) than the geostatistical Kriging method. However, it is important to note that these results are specific to the study areas and datasets used and may not be generalisable to other areas. Any improvement in accuracy or computation time can vary depending on the specific study area, the size and complexity of the dataset, and the specific algorithms used for ANN and Kriging.

Machine learning algorithms, such as deep neural networks, genetic algorithms, and decision tree algorithms, can also be integrated with other models, such as fracture flow models, multiphase flow models, analytical models, finite element models, finite difference models, and geostatistical models like kriging interpolation (discussed above), to create hybrid models [148].

One example of a hybrid model combines a finite element model with an ANN and ANFIS to simulate spatiotemporal groundwater levels [149]. That study showed that wavelet-based de-noised data enhanced the modelling performance by up to 14%. Another study proposed a hybrid model that combines a multiphase flow model with a machine-learning algorithm to simulate groundwater contamination [150]. The hybrid model outperformed the traditional multiphase flow model in accuracy and efficiency by 13%.

Additionally, a study proposed a hybrid model that combines a finite element model with an SVM algorithm for groundwater anomaly detection [151]. The hybrid model was shown to improve the accuracy of the simulation and reduce the computational cost.

2.7. Genetic Algorithm (GA) and Decision Tree (DT)

GA and DT are popular machine-learning algorithms in various fields, including groundwater analysis. GA is a search heuristic that is inspired by the process of natural selection and genetics. It is used to find the optimal solution to a problem by exploring a large search space [135]. GA can be used to optimise parameters for groundwater models, such as determining the best values for hydraulic conductivity or recharge rates. GA generates a population of potential solutions, selects the fittest solutions, and uses genetic operators such as mutation and crossover to create new solutions. The process continues until a stopping criterion is met, such as a satisfactory solution or a maximum number of generations.
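The loop described above (population, selection, crossover, mutation) can be sketched as a toy real-coded GA. Everything named here is hypothetical and chosen for illustration: the function `genetic_search`, the operator settings, and the made-up calibration target of 3.2 m/day for hydraulic conductivity. A real application would replace `misfit` with the error of a groundwater model run.

```python
import random


def genetic_search(fitness, low, high, pop_size=30, generations=60, seed=0):
    """Minimise `fitness` over one real parameter in [low, high].

    Toy real-coded GA: tournament selection, arithmetic crossover,
    Gaussian mutation, and one elite carried over each generation.
    """
    rng = random.Random(seed)
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        elite = min(population, key=fitness)
        children = [elite]  # elitism: never lose the best solution so far
        while len(children) < pop_size:
            # Tournament selection: the fitter of two random individuals.
            p1 = min(rng.sample(population, 2), key=fitness)
            p2 = min(rng.sample(population, 2), key=fitness)
            # Arithmetic crossover: a random blend of the two parents.
            w = rng.random()
            child = w * p1 + (1.0 - w) * p2
            # Gaussian mutation with a small probability.
            if rng.random() < 0.2:
                child += rng.gauss(0.0, 0.05 * (high - low))
            # Clip the child back into the allowed range.
            children.append(min(max(child, low), high))
        population = children
    return min(population, key=fitness)


# Hypothetical calibration: recover a "true" conductivity of 3.2 m/day.
def misfit(k):
    return (k - 3.2) ** 2


best_k = genetic_search(misfit, low=0.0, high=10.0)
```

The fixed number of generations acts as the stopping criterion here; the result is a good candidate rather than a guaranteed optimum.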

DT, conversely, is a decision-making algorithm that builds a tree-like model of decisions and their possible consequences [152]. DT is a supervised learning algorithm used for classification and regression tasks. In groundwater analysis, DT can predict groundwater quality, identify contaminated sites, or classify different groundwater types based on hydrogeochemical characteristics. DT works by recursively partitioning the data into subsets based on the values of the input features and then creating decision nodes to predict the target variable based on the partitioned data. GA, for its part, has a slow computation speed and complex gene encoding/decoding processes, which pose challenges for groundwater researchers, particularly in dealing with complex problems. A hybrid model that combines GA and DT can take advantage of the strengths of both algorithms. GA can optimise the parameters of the DT model to improve its performance and accuracy. For example, GA can be used to determine the best features to include in the DT model or to optimise the hyperparameters of the DT algorithm.

Additionally, GA can be used to reduce the size of the dataset and eliminate irrelevant features, which can improve the efficiency and accuracy of the DT model [153,154]. Compared to single AI methods, the GA-DT hybrid approach can reduce the computational time required to optimise the model parameters. However, the time improvement can also depend on the size and complexity of the dataset. Therefore, GA and DT are two powerful machine-learning algorithms that can be used for groundwater analysis. A hybrid model that combines these algorithms can take advantage of their respective strengths and improve the performance and accuracy of the model. Researchers can use this hybrid model to predict groundwater quality, identify contaminated sites, or classify different groundwater types depending on their research question and dataset.

2.8. Deep Belief Networks (DBN) and Support Vector Regression (SVR)

DBN is a type of artificial neural network that is used for unsupervised learning. It comprises multiple layers of hidden units that learn to represent the input data hierarchically. DBN can extract complex features from large datasets, which can be used for other machine-learning tasks such as classification or regression. In groundwater analysis, DBN can be used to analyse large datasets of hydrogeological parameters and extract meaningful information to understand the groundwater system better [155]. SVR, on the other hand, is a regression algorithm that is used to predict continuous variables. It fits a function in a high-dimensional feature space that keeps as many data points as possible within a specified tolerance (epsilon) of the prediction while remaining as flat as possible. SVR can predict groundwater levels, flow rates, or other continuous variables important for groundwater management [156,157]. SVR is beneficial when dealing with nonlinear and complex relationships between variables.
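SVR training penalises only prediction errors that exceed a tolerance, commonly written as epsilon. The short sketch below computes this epsilon-insensitive loss for a few made-up groundwater levels; the function name and the numbers are assumptions for illustration, not values from the cited studies.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean epsilon-insensitive loss.

    Errors with absolute value at most `epsilon` cost nothing;
    larger errors are penalised linearly by the excess.
    """
    excess = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(excess) / len(excess)


# Hypothetical observed and predicted groundwater levels (m).
observed = [10.0, 10.5, 11.0]
predicted = [10.05, 10.9, 11.0]
loss = epsilon_insensitive_loss(observed, predicted, epsilon=0.1)
```

Only the second prediction, which misses by about 0.4 m, contributes to the loss; the other two fall inside the tolerance band and are ignored.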

A hybrid model combining DBN and SVR can take advantage of the strengths of both algorithms. DBN can extract complex features from large datasets and reduce the dimensionality of the input data, which can then be used as input to the SVR algorithm, resulting in better performance and accuracy of the predictions and helping overcome some of the limitations of each algorithm [157].

Whereas DBN can suffer from overfitting if the number of hidden units is too large and SVR can be sensitive to the selection of hyperparameters, the hybrid model can provide a more robust and stable solution to groundwater prediction problems, including groundwater levels, flow rates, or other important continuous variables for groundwater management [142,155,158]. In one study [157], researchers proposed a DBN-SVR method that accurately predicts water quality parameters and outperforms standalone models such as SVR and DBN. This method improves performance indicators such as MAE, MAPE, RMSE, and R² by up to 85% compared to DBN, achieves a high goodness of fit, and surpasses BP. Although the combined model takes longer to run, it provides the best prediction accuracy. Thus, the coefficient of determination indicates that the hybrid AI is superior to BP, SVR, and DBN in predicting water quality, with better accuracy and robustness.
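The performance indicators named above can be computed as in the following sketch. The function name and the sample values are assumptions made for illustration. R² is computed here as the coefficient of determination (one minus the ratio of residual to total sum of squares); some studies report the squared correlation coefficient instead.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, MAPE (%), RMSE and R2 for paired observations."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined if any observation equals zero.
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


# Hypothetical water quality observations and model predictions.
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
```

Lower MAE, MAPE, and RMSE and a higher R² indicate a better fit, which is the sense in which the hybrid model is compared with the standalone models.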

2.9. Particle Swarm Optimisation (PSO) and Support Vector Regression (SVR)

PSO and SVR are popular methods for groundwater quality management. PSO is a global optimisation algorithm that can handle discrete and continuous variables [159,160], while SVR provides the regression capability. When used together as a hybrid AI system, PSO and SVR can overcome the limitations of each method and improve the accuracy of groundwater quality predictions [161]. The advantages of using PSO-SVR in groundwater quantity management include its ability to handle nonlinear relationships and model complex systems and its ability to provide accurate predictions even when the available data is limited or incomplete. Additionally, PSO-SVR can optimise the pumping schedule for groundwater extraction, resulting in significant cost savings and helping preserve the groundwater resource. However, PSO-SVR also has some disadvantages, such as the need for high computational resources to optimise the model and the potential for overfitting the data. Despite these limitations, PSO-SVR is a promising tool for groundwater quantity management. It could improve our understanding of groundwater systems and inform decision-making for sustainable management of groundwater resources.
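A basic global-best PSO can be sketched as below. In a PSO-SVR hybrid, the objective would normally be the cross-validated error of an SVR model as a function of its hyperparameters; here a smooth stand-in function is used so the example stays self-contained. The function names, the swarm settings, and the surrogate objective are all assumptions for illustration.

```python
import random


def pso_minimise(objective, bounds, n_particles=20, iterations=100, seed=0):
    """Minimise `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    inertia, c_personal, c_global = 0.7, 1.5, 1.5  # common textbook settings
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle towards its own best and the swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Stand-in for an SVR validation error over (log10 of C, epsilon).
def surrogate_error(params):
    log_c, eps = params
    return (log_c - 1.0) ** 2 + (eps - 0.2) ** 2


best_params, best_val = pso_minimise(surrogate_error, [(-2.0, 3.0), (0.0, 1.0)])
```

Because every objective evaluation would involve training an SVR model in practice, the swarm size and iteration count directly drive the computational cost noted above.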

2.10. Rough Set Theory (RST) and Support Vector Machines (SVM)

Groundwater quality management is a complex and challenging task that requires the integration of various data sources and decision-making tools. One promising approach is hybrid artificial intelligence methods, such as RST and SVMs. RST is a mathematical approach that can handle uncertain and incomplete data by defining lower and upper approximations of sets [162]. SVMs, on the other hand, are a type of supervised learning algorithm that can classify data by finding the optimal hyperplane that separates different classes [162,163]. Combining these two approaches can lead to a powerful and robust decision-making tool for groundwater quality management. The advantages of combining rough set theory and SVMs as a hybrid AI method in groundwater quality management include their ability to handle complex and high-dimensional datasets, their flexibility in dealing with uncertain and incomplete data, and their capability to provide accurate and reliable predictions [145,164,165]. However, there are also some disadvantages, such as the potential for overfitting, the requirement for a large amount of training data, and difficulty interpreting the results.
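The lower and upper approximations at the core of RST can be illustrated with a few lines of Python. The wells, attribute values, and labels below are invented for the example, and `rough_approximations` is a hypothetical helper name.

```python
from collections import defaultdict


def rough_approximations(samples, attributes, target_ids):
    """Return (lower, upper) approximations of the set `target_ids`.

    `samples` maps a sample id to a dict of attribute values. Samples
    with identical values on `attributes` are indiscernible.
    """
    classes = defaultdict(set)
    for sample_id, values in samples.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(sample_id)
    target = set(target_ids)
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:
            lower |= members  # class lies entirely inside the target set
        if members & target:
            upper |= members  # class overlaps the target set
    return lower, upper


# Hypothetical wells described by two discretised attributes.
wells = {
    "W1": {"nitrate": "high", "depth": "shallow"},
    "W2": {"nitrate": "high", "depth": "shallow"},
    "W3": {"nitrate": "low", "depth": "deep"},
    "W4": {"nitrate": "high", "depth": "deep"},
}
polluted = {"W1", "W4"}  # made-up labels
lower, upper = rough_approximations(wells, ["nitrate", "depth"], polluted)
```

W1 and W2 share the same attribute values but differ in label, so they fall only in the upper approximation; W4 is certainly polluted given these attributes and falls in the lower approximation. The gap between the two sets is the region of uncertainty that a downstream classifier such as an SVM has to resolve.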

3. Less Common Hybrid AI Models

In groundwater sciences, hybrid AI models that combine different machine learning techniques have become increasingly popular for various purposes, such as predicting groundwater levels, mapping groundwater, and assessing quality. A table has been compiled, which lists 15 such models, including Hybrid Decision Tree (HDT) and Genetic Algorithm (GA), Self-Organizing Map (SOM) and Decision Tree (DT), Neural Network (NN) and Principal Component Analysis (PCA), Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), and Artificial Neural Networks (ANN) and Markov Chain Monte Carlo (MCMC). These models offer several advantages over single methods. For example, HDT-GA can improve the accuracy and interpretability of decision trees by selecting the best attributes for splitting [166]. SOM-DT can be used to cluster data into classes, and then the decision tree can be applied to each cluster separately [167]. NN-PCA can reduce the dimensionality of data and increase the efficiency of neural network training [168]. CNN-LSTM can capture spatial and temporal data features, making them suitable for image and speech recognition tasks [169]. ANN-MCMC can be used to estimate the posterior distribution of parameters in neural networks, allowing uncertainty to be incorporated into predictions [170].

Despite their potential benefits, these hybrid models have yet to be fully explored, and their limitations and applicability must be thoroughly investigated. The less popular hybrid methods, such as HDT-GA, SOM-DT, NN-PCA, CNN-LSTM, and ANN-MCMC, have received less attention in the literature. Developing and implementing these hybrid models can be challenging, requiring significant expertise in different areas of machine learning. Additionally, the effectiveness of these methods can depend highly on the specific problem being addressed and the quality of the data available. Furthermore, there is often a limited understanding of how these methods work and how to interpret their results, making it challenging to apply them in practice. Nonetheless, using hybrid AI models in groundwater sciences is a rapidly evolving field, with researchers continuously developing new models and algorithms to manage groundwater resources more effectively. There is, therefore, immense potential for developing new hybrid AI models to advance our understanding and management of groundwater resources.

This entry is adapted from the peer-reviewed paper 10.3390/w15091750

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply.