Soft Computing Applications in Air Quality Modeling: Comparison
Please note this is a comparison between Version 1 by Muhammad Muhitur Rahman and Version 3 by Rita Xu.

Air quality models simulate atmospheric environment systems and provide increased domain knowledge and reliable forecasting. They provide early warnings to the population and reduce the number of measuring stations required. Due to the complexity and non-linear behavior associated with air quality data, soft computing models have become popular in air quality modeling (AQM). This study critically investigates, analyzes, and summarizes the existing soft computing modeling approaches. Among the many soft computing techniques in AQM, this article reviews and discusses artificial neural networks (ANN), support vector machines (SVM), evolutionary ANN and SVM, fuzzy logic models, neuro-fuzzy systems, deep learning models, ensemble models, and other hybrid models. It also sheds light on the input variables employed, the data processing approaches, and the objective functions targeted during modeling. The discussion in this paper will help to determine the suitability and appropriateness of a particular model for a specific modeling context.

  • Adaptive neuro-fuzzy inference system
  • artificial neural networks
  • air quality model
  • deep learning
  • ensemble model
  • evolutionary techniques
  • fuzzy logic model
  • soft computing model
  • support vector machine

1. Potential Soft Computing Models and Approaches

Among many potential techniques, different variations of artificial neural networks, evolutionary fuzzy and neuro-fuzzy models, ensemble and hybrid models, and knowledge-based models should be further explored. Besides, there is a continuing need for the development of a universal model, as most of the explored models are either site-dependent or pollutant-dependent. This section discusses future research directions and potential soft computing models that can be investigated in air quality modeling throughout the world.

1.1. Variations of ANN Models

As can be observed from Section 3, ANN approaches have been widely explored in AQM, and in most cases MLP-NN, BP-NN, RBF-NN, or R-NN were employed. Other available ANN variants (GR-NN, GC-NN, P-NN, W-NN, and others) that have successfully modeled complex and non-linear problems in other engineering fields have not been explored significantly [1]. Many of them (extreme learning machine, multitasking, probabilistic, time delay, modular, and other hybrid neural networks) are rarely explored. Besides, deep neural network models have received great attention in modeling PM2.5 concentrations, but other air pollutants have not been modeled significantly. Therefore, such unexplored and rarely explored variations of neural networks can be investigated in future work for modeling all types of air pollutant concentrations.

1.2. Evolutionary Fuzzy and Neuro-Fuzzy Models

Fuzzy systems are proven tools for modeling complex and non-linear problems in many applications. However, their lack of learning capability has encouraged researchers to augment them by hybridization with EO techniques [2][3][4][4]. Among the many EO techniques, GA, GWO, CSA, SCA, and PSO are widely used, well-known global search optimization approaches that can explore a large search space for suitable solutions [5][6][7]. Besides, the type-2 fuzzy set, which can handle more uncertainty than the type-1 fuzzy set, has been successfully applied in a wide range of areas [8][9]. Therefore, given the potential of these fuzzy logic approaches, they can be explored in the field of AQM.
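
The hybridization described above can be illustrated with a minimal sketch in which a particle swarm optimizer (PSO) tunes the rule consequents of a very small fuzzy model. Everything here is an assumption made for illustration: the two-rule zero-order Sugeno model, the membership functions, the synthetic data (a hypothetical normalized wind speed against a pollutant concentration), and the PSO settings are not taken from any of the reviewed studies.

```python
import random

def sugeno_predict(x, consequents):
    """Zero-order Sugeno model with two rules: IF x is LOW / IF x is HIGH."""
    low_c, high_c = consequents
    w_low, w_high = max(0.0, 1.0 - x), max(0.0, x)  # triangular memberships
    return (w_low * low_c + w_high * high_c) / (w_low + w_high)

def mse(consequents, data):
    """Mean squared error of the fuzzy model on (x, y) pairs."""
    return sum((sugeno_predict(x, consequents) - y) ** 2 for x, y in data) / len(data)

def pso(objective, dim, bounds, particles=20, iterations=200, seed=0):
    """Basic global-best PSO (inertia 0.7, acceleration coefficients 1.5)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                             + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:          # update personal best
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:            # update global best
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

# Synthetic data (assumption): concentration falls as normalized wind speed rises.
DATA = [(i / 10, 50.0 - 40.0 * (i / 10)) for i in range(11)]
tuned, err = pso(lambda c: mse(c, DATA), dim=2, bounds=(0.0, 100.0))
```

In this sketch the fuzzy rule base is fixed by hand and only the consequents are learned; the same loop could, in principle, also tune membership function parameters.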

1.3. Group Method Data Handling Models and Functional Network Models

Long-term research in the field of neural networks and advanced statistical methods has contributed to the evolution of an abductory induction mechanism known as GMDH [10]. It automatically synthesizes abductive networks from a database of inputs and outputs with complex and nonlinear relationships. Other extensions of the neural network models include functional network models (FNM) [11]. An FNM determines the structure of the network using domain knowledge and data, and estimates the unknown neuron functions. Both GMDH and FNM have been explored in many relevant applications [12][13][14]. These rarely explored extensions of neural networks can be further investigated in AQM.
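
To make the idea of automatically synthesizing a network from input-output data more concrete, the following is a minimal sketch of a single GMDH-style layer. It assumes a quadratic polynomial of each pair of inputs as the partial model (one common choice), fits it by least squares on a training split, and ranks the candidates by their error on a separate validation split. The function names and the synthetic data are hypothetical and are not taken from the cited works.

```python
import itertools
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def quad_terms(u, v):
    """Terms of the quadratic partial model in two inputs."""
    return [1.0, u, v, u * v, u * u, v * v]

def fit_pair(rows, y, i, j):
    """Least-squares fit of the partial model on inputs i and j (normal equations)."""
    phi = [quad_terms(r[i], r[j]) for r in rows]
    k = len(phi[0])
    ata = [[sum(p[a] * p[b] for p in phi) for b in range(k)] for a in range(k)]
    aty = [sum(p[a] * t for p, t in zip(phi, y)) for a in range(k)]
    return solve_linear(ata, aty)

def predict_pair(coef, row, i, j):
    return sum(c * t for c, t in zip(coef, quad_terms(row[i], row[j])))

def gmdh_layer(train_x, train_y, val_x, val_y, keep=2):
    """Fit every input pair, rank by validation MSE, keep the best candidates."""
    candidates = []
    for i, j in itertools.combinations(range(len(train_x[0])), 2):
        coef = fit_pair(train_x, train_y, i, j)
        err = sum((predict_pair(coef, r, i, j) - t) ** 2
                  for r, t in zip(val_x, val_y)) / len(val_y)
        candidates.append((err, (i, j), coef))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]

# Synthetic data (assumption): the target depends only on inputs 0 and 1.
rng = random.Random(0)
rows = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(60)]
target = [1.0 + 2.0 * r[0] * r[1] for r in rows]
best = gmdh_layer(rows[:40], target[:40], rows[40:], target[40:])
```

A full GMDH network would feed the outputs of the retained candidates into further layers and stop when the validation error no longer improves; that step is omitted here for brevity.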

1.4. Case-Based Reasoning and Knowledge-Based Models

Case-based reasoning solves new problems by recalling the experiences and solutions of similar past problems [15]. It handles a given problem in four steps, namely retrieve, reuse, revise, and retain [16]. Another soft computing technique, the knowledge-based system, attempts to solve problems by giving advice in a domain and utilizing the knowledge provided by a human expert [17]. Researchers have employed both techniques to solve many complex problems [18][19][20]. These techniques can be investigated in AQM, as neither has yet been explored.
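
The four-step cycle can be sketched as follows. This is only an illustration under stated assumptions: cases are hypothetical (temperature, wind speed) feature tuples paired with a pollutant concentration, similarity is plain Euclidean distance, and the revise step simply replaces the proposed value with an observed one when it becomes available.

```python
import math

class CaseBase:
    """Minimal case base storing (features, solution) pairs."""

    def __init__(self):
        self.cases = []

    def retrieve(self, query):
        """Retrieve: find the stored case closest to the query."""
        return min(self.cases, key=lambda case: math.dist(case[0], query))

    def reuse(self, case):
        """Reuse: propose the solution of the retrieved case."""
        return case[1]

    def revise(self, proposed, observed=None):
        """Revise: correct the proposal once the true value is known."""
        return observed if observed is not None else proposed

    def retain(self, query, solution):
        """Retain: store the solved problem as a new case."""
        self.cases.append((tuple(query), solution))

def cbr_cycle(case_base, query, observed=None):
    """Run one retrieve-reuse-revise-retain cycle and return the final solution."""
    proposed = case_base.reuse(case_base.retrieve(query))
    final = case_base.revise(proposed, observed)
    case_base.retain(query, final)
    return final

# Hypothetical cases: (temperature, wind speed) -> pollutant concentration.
cb = CaseBase()
cb.retain((20.0, 2.0), 35.0)
cb.retain((5.0, 8.0), 12.0)
first = cbr_cycle(cb, (18.0, 3.0))                    # no observation yet
second = cbr_cycle(cb, (19.0, 2.5), observed=40.0)    # corrected by observation
```

In practice the features would need scaling before distances are compared, and the revise step would typically involve domain rules rather than a direct overwrite.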

1.5. Ensemble and Hybrid Models

As discussed earlier, ensemble models employ multiple learning techniques in parallel and combine their outputs to achieve better generalization performance. In real-world situations, they aim to balance the strengths and weaknesses of each model and arrive at the best possible solution [21]. Recently, such models have gained considerable momentum in AQM, but their use has been limited to a few specific pollutants (mainly PM2.5). Researchers should invest more time in these attractive tools, as they will become some of the most prominent tools for AQM in the future.
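
As a rough illustration of combining the outputs of several models run in parallel, the sketch below averages three very simple one-step-ahead forecasters, optionally weighting each by the inverse of its mean absolute error on past data. The forecasters are stand-ins for trained learners, and the weighting scheme and all names are assumptions made for this example rather than a method taken from the reviewed studies.

```python
def persistence(history):
    """Forecast the next value as the last observed value."""
    return history[-1]

def moving_average(history, window=3):
    """Forecast the next value as the mean of the most recent values."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def linear_trend(history):
    """Forecast by extending the last observed change."""
    return history[-1] + (history[-1] - history[-2])

def inverse_error_weights(series, models, start=3, eps=1e-9):
    """Weight each model by 1 / (mean absolute one-step error on the series)."""
    weights = []
    for model in models:
        errors = [abs(model(series[:t]) - series[t])
                  for t in range(start, len(series))]
        weights.append(1.0 / (sum(errors) / len(errors) + eps))
    return weights

def ensemble_forecast(history, models, weights=None):
    """Combine the model outputs by a (weighted) average."""
    predictions = [model(history) for model in models]
    if weights is None:
        weights = [1.0] * len(predictions)
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)

MODELS = [persistence, moving_average, linear_trend]
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]   # synthetic concentrations
weights = inverse_error_weights(series, MODELS)
forecast = ensemble_forecast(series, MODELS, weights)
```

With equal weights the ensemble is a plain average; with error-based weights it leans toward whichever member has tracked the series best so far.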

1.6. Development of Universal Models

Most of the discussed models are either site-dependent or pollutant-dependent. There is no guarantee that a model developed for a specific site will be stable and reliable at another location with different meteorological conditions. Therefore, there is always a need for the development of a universal model for AQM. Besides, comparing site-specific models could be an attractive option for future research, as it aids in developing site characterizations. Such research may enable the creation of guidelines for site-specific model development.

1.7. Appropriate Input Selection Methods

As discussed in Section 2, several approaches have been reported to reduce the input space by selecting the most dominant input variables. In addition, most of the approaches selected air pollutant and meteorological data as inputs. A few of them considered other types of data, including temporal, traffic, geographical, and sustainable data. Therefore, the present authors believe that comparing such input selection methods while considering all available input data types could be an attractive field of research in AQM. Besides, the selection of proper decomposition components for reducing data dimensionality could be considered another potential research direction, as the inclusion of many components in the input space may result in model complexity and the accumulation of errors. Moreover, other data pre-processing and feature extraction techniques employed in relevant fields could also be explored.
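
One of the simplest ways to select dominant inputs is to rank the candidates by their correlation with the target pollutant. The sketch below is a minimal, assumed version of such an analysis: it uses the Pearson correlation at zero lag only, and the variable names, values, and threshold are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

def rank_inputs(candidates, target, threshold=0.0):
    """Rank candidate inputs by |correlation| with the target, dropping weak ones."""
    scored = [(name, pearson(values, target)) for name, values in candidates.items()]
    kept = [(name, r) for name, r in scored if abs(r) >= threshold]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical observations of three candidate inputs and a target pollutant.
candidates = {
    "temperature": [10.0, 12.0, 14.0, 16.0, 18.0],
    "wind_speed": [5.0, 3.0, 6.0, 2.0, 4.0],
    "traffic": [100.0, 120.0, 150.0, 160.0, 200.0],
}
pm = [30.0, 34.0, 38.0, 42.0, 46.0]
ranked = rank_inputs(candidates, pm)
selected = [name for name, _ in rank_inputs(candidates, pm, threshold=0.5)]
```

Correlation only captures linear dependence, which is one reason the comparison with other selection methods suggested above remains relevant.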

2. Conclusions

Soft computing models have become very popular in air quality modeling as they can efficiently model the complexity and non-linearity associated with air quality data. This article critically reviewed and discussed existing soft computing modeling approaches. Among the many available soft computing techniques, the artificial neural networks with variations of structures and the hybrid modeling approaches combining several techniques were widely explored in predicting air pollutant concentrations throughout the world. Other approaches, including support vector machines, evolutionary artificial neural networks and support vector machines, fuzzy logic, and neuro-fuzzy systems, have also been used in air quality modeling for several years. Recently, deep learning and ensemble models have received huge momentum in modeling air pollutant concentrations due to their wide range of advantages over other available techniques. Additionally, this research reviewed and listed all possible input variables for air quality modeling. It also discussed several input selection processes, including cross-correlation analysis, principal component analysis, random forest, learning vector quantization, rough set theory, and wavelet decomposition techniques. Besides, this article sheds light on several data recovery approaches for missing data, including linear interpolation, multivariate imputation by chained equations, and expectation-maximization imputation methods.

Finally, it proposed many advanced, reliable, and self-organizing soft computing models that have rarely or never been explored in the field of air quality modeling. For instance, functional network models, variations of neural network models, evolutionary fuzzy and neuro-fuzzy systems, type-2 fuzzy logic models, group method data handling, case-based reasoning, ensemble and hybrid models, and knowledge-based systems have immense potential for modeling air pollutant concentrations. Moreover, modelers can compare the effectiveness of several input selection processes to find the most suitable one for air quality modeling. Furthermore, they can attempt to build universal models instead of developing site-specific and pollutant-specific models. The authors believe that the findings of this review article will help researchers and decision-makers in determining the suitability and appropriateness of a particular model for a specific modeling context.
 
 


  • Herrera, F.; Lozano, M.; Verdegay, J.L. Dynamic and heuristic fuzzy connectives-based crossover operators for controlling the diversity and convergence of real-coded genetic algorithms. Int. J. Intell. Syst. 1998, 11, 1013–1040.
  • Cordón, O.; Gomide, F.; Herrera, F.; Hoffmann, F.; Magdalena, L. Ten years of genetic fuzzy systems: Current framework and new trends. Fuzzy Sets Syst. 2004, 141, 5–31.
  • Hassan, M.R.; Arafat, S.M.; Begg, R.K. Fuzzy-Genetic Model for the Identification of Falls Risk Gait. In Proceedings of the Procedia Computer Science; Elsevier B.V.: Amsterdam, The Netherlands, 2016; Volume 82, pp. 4–11.
  • Chatterjee, A.; Chatterjee, R.; Matsuno, F.; Endo, T. Neuro-fuzzy state modeling of flexible robotic arm employing dynamically varying cognitive and social component based PSO. Meas. J. Int. Meas. Confed. 2007, 40, 628–643.
  • Dolgopolov, P.; Konstantinov, D.; Rybalchenko, L.; Muhitovs, R. Optimization of train routes based on neuro-fuzzy modeling and genetic algorithms. In Proceedings of the Procedia Computer Science; Elsevier B.V.: Amsterdam, The Netherlands, 2019; Volume 149, pp. 11–18.
  • Ashish, K.; Dasari, A.; Chattopadhyay, S.; Hui, N.B. Genetic-neuro-fuzzy system for grading depression. Appl. Comput. Informatics 2018, 14, 98–105.
  • Douiri, M.R. Particle swarm optimized neuro-fuzzy system for photovoltaic power forecasting model. Sol. Energy 2019, 91–104.
  • Karnik, N.N.; Mendel, J.M. Applications of type-2 fuzzy logic systems: Handling the uncertainty associated with surveys. In Proceedings of the FUZZ-IEEE’99. 1999 IEEE International Fuzzy Systems. Conference Proceedings (Cat. No.99CH36315); IEEE: Piscataway, NJ, USA, 1999; Volume 3, pp. 1546–1551.
  • Shafaei Bajestani, N.; Vahidian Kamyad, A.; Nasli Esfahani, E.; Zare, A. Prediction of retinopathy in diabetic patients using type-2 fuzzy regression model. Eur. J. Oper. Res. 2018, 264, 859–869.
  • Sharifian, A.; Ghadi, M.J.; Ghavidel, S.; Li, L.; Zhang, J. A new method based on Type-2 fuzzy neural network for accurate wind power forecasting under uncertain data. Renew. Energy 2018, 120, 220–230.
  • Barron, A. Predicted squared error: A criterion for automatic model selection. In Proceedings of the Self-Organizing Methods in Modeling; Marcel Dekker: New York, NY, USA, 1984; pp. 87–103.
  • Castillo, E. Functional Networks. Neural Process. Lett. 1998, 7, 151–159.
  • Zhou, G.; Zhou, Y.; Huang, H.; Tang, Z. Functional networks and applications: A survey. Neurocomputing 2019, 335, 384–399.
  • Wu, J.; Wang, Y.; Zhang, X.; Chen, Z. A novel state of health estimation method of Li-ion battery using group method of data handling. J. Power Sources 2016, 327, 457–464.
  • Liu, H.; Duan, Z.; Wu, H.; Li, Y.; Dong, S. Wind speed forecasting models based on data decomposition, feature selection and group method of data handling network. Measurement 2019, 148, 106971.
  • Kolodner, J.L. An introduction to case-based reasoning. Artif. Intell. Rev. 1992, 6, 3–34.
  • Aamodt, A.; Plaza, E. Case-based reasoning: Foundational issues, methodological variations, and system approaches. J. AI Commun. 1994, 7, 39–59.
  • Meyer, M.D.; Watson, L.S.; Walton, M.; Skinner, R.E. Artificial Intelligence in Transportation: Information for Application; National Research Council: Washington, DC, USA, 2007.
  • Abutair, H.Y.A.; Belghith, A. Using Case-Based Reasoning for Phishing Detection. In Proceedings of the Procedia Computer Science; Elsevier B.V.: Amsterdam, The Netherlands, 2017; Volume 109, pp. 281–288.
  • Raza, B.; Kumar, Y.J.; Malik, A.K.; Anjum, A.; Faheem, M. Performance prediction and adaptation for database management system workload using Case-Based Reasoning approach. Inf. Syst. 2018, 76, 46–58.
  • Blondet, G.; Le Duigou, J.; Boudaoud, N. A knowledge-based system for numerical design of experiments processes in mechanical engineering. Expert Syst. Appl. 2019, 122, 289–302.
  • Sammut, C.; Webb, G.I. Encyclopedia of Machine Learning and Data Mining, 2nd ed.; Springer: New York, NY, USA, 2017; ISBN 9781489976857.