Soft Computing Applications in Air Quality Modeling

Air quality models simulate atmospheric environment systems and provide increased domain knowledge and reliable forecasting. They provide early warnings to the population and reduce the number of measuring stations required. Due to the complexity and non-linear behavior associated with air quality data, soft computing models have become popular in air quality modeling (AQM). This study critically investigates, analyzes, and summarizes the existing soft computing modeling approaches. Among the many soft computing techniques in AQM, this article reviews and discusses the artificial neural network (ANN), the support vector machine (SVM), evolutionary ANN and SVM, the fuzzy logic model, neuro-fuzzy systems, the deep learning model, ensemble models, and other hybrid models. It also sheds light on the input variables employed, the data processing approaches, and the objective functions targeted during modeling. The discussion in this paper will help to determine the suitability and appropriateness of a particular model for a specific modeling context.

  • adaptive neuro-fuzzy inference system
  • artificial neural networks
  • air quality model
  • deep learning
  • ensemble model
  • evolutionary techniques
  • fuzzy logic model
  • soft computing model
  • support vector machine

1. Potential Soft Computing Models and Approaches

Among the many potential techniques, different variations of artificial neural networks, evolutionary fuzzy and neuro-fuzzy models, ensemble and hybrid models, and knowledge-based models should be further explored. In addition, there is a continuing need for the development of a universal model, as most of the explored models are either site-dependent or pollutant-dependent. This section discusses future research directions and potential soft computing models that can be investigated in air quality modeling throughout the world.

1.1. Variations of ANN Models

As can be observed from Section 3, ANN approaches have been widely explored in AQM, and in most cases MLP-NN, BP-NN, RBF-NN, or R-NN were employed. Other ANN variations (GR-NN, GC-NN, P-NN, W-NN, and others) that have successfully modeled complex and non-linear problems in other engineering fields have not been explored significantly [1]. Several of them (extreme learning machine, multitasking, probabilistic, time delay, modular, and other hybrid neural networks) are rarely explored. In addition, deep neural network models have received great attention in modeling PM2.5 concentrations, but they have rarely been applied to other air pollutants. Therefore, these unexplored and rarely explored neural network variations can be investigated in future work for modeling all types of air pollutant concentrations.

1.2. Evolutionary Fuzzy and Neuro-Fuzzy Models

Fuzzy systems are proven tools for modeling complex and non-linear problems in many applications. However, the lack of learning capabilities in fuzzy systems has encouraged researchers to augment them by hybridizing them with evolutionary optimization (EO) techniques [2][3][4]. Among the many EO techniques, GA, GWO, CSA, SCA, and PSO are widely used and well-known global search optimization approaches with the ability to explore a large search space for suitable solutions [5][6][7]. In addition, the type-2 fuzzy set, which can handle more uncertainty than the type-1 fuzzy set, has been successfully applied in a wide range of areas [8][9]. Therefore, considering their potential, these fuzzy logic approaches can be explored in the field of AQM.

1.3. Group Method of Data Handling Models and Functional Network Models

Long-term research in the field of neural networks and advanced statistical methods has contributed to the evolution of an abductory induction mechanism known as the group method of data handling (GMDH) [10]. It automatically synthesizes abductive networks from a database of inputs and outputs with complex and non-linear relationships. Another extension of the neural network models is the functional network model (FNM) [11]. An FNM uses domain knowledge and data to determine the structure of the network and then estimates the unknown neuron functions. Both GMDH and FNM have been explored in many relevant applications [12][13][14]. These rarely explored extensions of neural networks can be further investigated in AQM.

1.4. Case-Based Reasoning and Knowledge-Based Models

Case-based reasoning solves new problems by recalling the experiences and solutions of similar past problems [15]. It handles a given problem in four steps, namely retrieve, reuse, revise, and retain [16]. Another soft computing technique, the knowledge-based system, attempts to solve problems by giving advice in a domain using the knowledge provided by a human expert [17]. Researchers have employed both techniques to solve many complex problems [18][19][20]. These techniques can be investigated in AQM, as neither has yet been explored.
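
The four steps named above can be illustrated with a minimal sketch. It is not taken from any of the cited works: the `Case` and `CaseBase` names, the Euclidean similarity measure, the averaging rule used for reuse, and the sample numbers are all assumptions chosen only to make the cycle concrete.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Case:
    features: tuple   # hypothetical inputs, e.g. (temperature, wind speed)
    solution: float   # hypothetical output, e.g. a pollutant concentration


class CaseBase:
    def __init__(self, cases):
        self.cases = list(cases)

    def retrieve(self, query, k=3):
        """Retrieve: the k stored cases closest to the query (Euclidean distance)."""
        return sorted(self.cases, key=lambda c: dist(c.features, query))[:k]

    def reuse(self, neighbours):
        """Reuse: propose a solution by averaging the retrieved solutions."""
        return sum(c.solution for c in neighbours) / len(neighbours)

    def revise(self, proposed, observed=None):
        """Revise: replace the proposal with the observed outcome if one exists."""
        return proposed if observed is None else observed

    def retain(self, query, solution):
        """Retain: store the solved problem as a new case."""
        self.cases.append(Case(tuple(query), solution))

    def solve(self, query, observed=None, k=3):
        proposed = self.reuse(self.retrieve(query, k))
        final = self.revise(proposed, observed)
        self.retain(query, final)
        return proposed, final


# Made-up cases purely for illustration.
base = CaseBase([
    Case((10.0, 2.0), 40.0),
    Case((12.0, 2.5), 44.0),
    Case((30.0, 6.0), 15.0),
])
proposed, final = base.solve((11.0, 2.0), observed=43.0, k=2)
```

In this toy run the two nearest cases are averaged to form the proposal, the later observation overrides it during revision, and the case base grows by one entry. A practical system would need a domain-specific similarity measure and adaptation rule.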

1.5. Ensemble and Hybrid Models

As discussed earlier, ensemble models employ multiple learning techniques in parallel and combine their outputs to produce better generalization performance. In a real-world situation, they aim to balance the strengths and weaknesses of each model and arrive at the best possible solution [21]. Recently, such models have gained considerable momentum in AQM, but their use has been limited to a few specific pollutants (mainly PM2.5). Researchers should invest more time in these attractive tools, as they are likely to become some of the most prominent tools for AQM in the future.
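
The idea of running several learners in parallel and combining their outputs can be sketched as follows. This is only an illustration under stated assumptions: the three base learners are deliberately trivial stand-ins for real models such as ANNs or SVMs, the combination rule is a weighted average (one of several possible rules), and the PM2.5 values are made up.

```python
from statistics import mean


def persistence(history):
    """Base learner 1: the next value equals the last observation."""
    return history[-1]


def moving_average(history, window=3):
    """Base learner 2: mean of the last `window` observations."""
    return mean(history[-window:])


def linear_trend(history):
    """Base learner 3: extrapolate the last observed step."""
    return history[-1] + (history[-1] - history[-2])


def ensemble_forecast(history, learners, weights=None):
    """Run every learner on the same input and combine by weighted average."""
    predictions = [learner(history) for learner in learners]
    if weights is None:
        weights = [1.0] * len(predictions)  # equal weights by default
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


pm25 = [30.0, 32.0, 34.0, 36.0]  # hypothetical hourly concentrations
learners = [persistence, moving_average, linear_trend]
forecast = ensemble_forecast(pm25, learners)
```

The weights could instead be tuned on validation data so that learners with lower error contribute more, which is one way an ensemble can balance the strengths and weaknesses of its members.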

1.6. Development of Universal Models

Most of the discussed models are either site-dependent or pollutant-dependent. There is no guarantee that a model developed for a specific site will be stable and reliable at another location with different meteorological conditions. Therefore, there is always a need for the development of a universal model for AQM. In addition, comparing site-specific models could be an attractive option for future research, as it aids in developing site characterizations. Such research may enable the creation of guidelines for site-specific model development.

1.7. Appropriate Input Selection Methods

As discussed in Section 2, several approaches have been reported to reduce the input space by selecting the most dominant input variables. Most of these approaches selected air pollutant and meteorological data as inputs. A few of them considered other types of data, including temporal, traffic, geographical, and sustainable data. Therefore, the present authors believe that comparing such input selection methods across all available input data types could be an attractive field of research in AQM. In addition, the selection of proper decomposition components for reducing data dimensionality could be considered another potential research direction, as including many components in the input space may increase model complexity and lead to the accumulation of errors. Moreover, other data pre-processing and feature extraction techniques employed in relevant fields could also be explored.
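
As a rough illustration of reducing the input space, the sketch below ranks candidate inputs by their Pearson correlation with the target and keeps those above a threshold. It is a simple stand-in for correlation-based selection, not the procedure of any reviewed study; the `rank_inputs` name, the 0.5 threshold, and the sample data are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_inputs(candidates, target, threshold=0.5):
    """Keep inputs with |r| >= threshold, strongest correlation first."""
    scores = {name: pearson(values, target) for name, values in candidates.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    kept.sort(key=lambda name: abs(scores[name]), reverse=True)
    return kept, scores


# Made-up series purely for illustration.
target = [10.0, 20.0, 30.0, 40.0, 50.0]          # pollutant concentration
candidates = {
    "temperature": [1.0, 2.0, 3.0, 4.0, 6.0],    # strongly related
    "wind_speed": [5.0, 4.0, 3.0, 2.0, 1.0],     # perfectly inverse
    "noise": [3.0, 1.0, 3.0, 1.0, 3.0],          # unrelated
}
selected, scores = rank_inputs(candidates, target)
```

A filter like this only captures linear, pairwise relationships, so it would normally be compared against other selection methods rather than used alone.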

2. Conclusions

Soft computing models have become very popular in air quality modeling as they can efficiently model the complexity and non-linearity associated with air quality data. This article critically reviewed and discussed existing soft computing modeling approaches. Among the many available soft computing techniques, artificial neural networks with various structures and hybrid modeling approaches combining several techniques have been widely explored in predicting air pollutant concentrations throughout the world. Other approaches, including support vector machines, evolutionary artificial neural networks and support vector machines, fuzzy logic, and neuro-fuzzy systems, have also been used in air quality modeling for several years. Recently, deep learning and ensemble models have gained considerable momentum in modeling air pollutant concentrations due to their wide range of advantages over other available techniques. Additionally, this research reviewed and listed all possible input variables for air quality modeling. It also discussed several input selection processes, including cross-correlation analysis, principal component analysis, random forest, learning vector quantization, rough set theory, and wavelet decomposition techniques. Finally, this article sheds light on several data recovery approaches for missing data, including linear interpolation, multivariate imputation by chained equations, and expectation-maximization imputation methods.

Finally, it proposed many advanced, reliable, and self-organizing soft computing models that have rarely or never been explored in the field of air quality modeling. For instance, functional network models, variations of neural network models, evolutionary fuzzy and neuro-fuzzy systems, type-2 fuzzy logic models, the group method of data handling, case-based reasoning, ensemble and hybrid models, and knowledge-based systems have immense potential for modeling air pollutant concentrations. Moreover, modelers can compare the effectiveness of several input selection processes to find the most suitable one for air quality modeling. Furthermore, they can attempt to build universal models instead of developing site-specific and pollutant-specific models. The authors believe that the findings of this review article will help researchers and decision-makers in determining the suitability and appropriateness of a particular model for a specific modeling context.


  1. Sheen Mclean Cabaneros; John Kaiser Calautit; Ben Richard Hughes; A review of artificial neural network models for ambient air pollution prediction. Environmental Modelling & Software 2019, 119, 285-304, 10.1016/j.envsoft.2019.06.014.
  2. Francisco Herrera; M. Lozano; J. L. Verdegay; Dynamic and heuristic fuzzy connectives-based crossover operators for controlling the diversity and convergence of real-coded genetic algorithms. International Journal of Intelligent Systems 1998, 11, 1013-1040, 10.1002/(sici)1098-111x(199612)11:12<1013::aid-int1>;2-k.
  3. O. Cordón; F. Gomide; Enrique Herrera-Viedma; F. Hoffmann; Luis Magdalena; Ten years of genetic fuzzy systems: current framework and new trends. Fuzzy Sets and Systems 2004, 141, 5-31, 10.1016/s0165-0114(03)00111-8.
  4. Dolgopolov, P.; Konstantinov, D.; Rybalchenko, L.; Muhitovs, R. Optimization of train routes based on neuro-fuzzy modeling and genetic algorithms. In Proceedings of the Procedia Computer Science; Elsevier B.V.: Amsterdam, The Netherlands, 2019; Volume 149, pp. 11–18.
  5. Kumar Ashish; Anish Dasari; Subhagata Chattopadhyay; Nirmal Baran Hui; Genetic-neuro-fuzzy system for grading depression. Applied Computing and Informatics 2018, 14, 98-105, 10.1016/j.aci.2017.05.005.
  6. Moulay Rachid Douiri; Particle swarm optimized neuro-fuzzy system for photovoltaic power forecasting model. Solar Energy 2019, 184, 91-104, 10.1016/j.solener.2019.03.098.
  7. Karnik, N.N.; Mendel, J.M. Applications of type-2 fuzzy logic systems: Handling the uncertainty associated with surveys. In Proceedings of the FUZZ-IEEE’99. 1999 IEEE International Fuzzy Systems. Conference Proceedings (Cat. No.99CH36315); IEEE: Piscataway, NJ, USA, 1999; Volume 3, pp. 1546–1551.
  8. Narges Shafaei Bajestani; Ali Vahidian Kamyad; Ensieh Nasli Esfahani; Assef Zare; Prediction of retinopathy in diabetic patients using type-2 fuzzy regression model. European Journal of Operational Research 2018, 264, 859-869, 10.1016/j.ejor.2017.07.046.
  9. Amir Sharifian; M. Jabbari Ghadi; Sahand Ghavidel; Li Li; Jiangfeng Zhang; A new method based on Type-2 fuzzy neural network for accurate wind power forecasting under uncertain data. Renewable Energy 2018, 120, 220-230, 10.1016/j.renene.2017.12.023.
  10. Barron, A. Predicted squared error: A criterion for automatic model selection. In Proceedings of the Self-Organizing Methods in Modeling; Marcel Dekker: New York, NY, USA, 1984; pp. 87–103.
  11. Castillo, E; Functional Networks. Neural Process. Lett. 1998, 7, 151–159.
  12. Guo Zhou; Yongquan Zhou; Huajuan Huang; Zhonghua Tang; Functional networks and applications: A survey. Neurocomputing 2019, 335, 384-399, 10.1016/j.neucom.2018.04.085.
  13. Ji Wu; Yujie Wang; Xu Zhang; Zonghai Chen; A novel state of health estimation method of Li-ion battery using group method of data handling. Journal of Power Sources 2016, 327, 457-464, 10.1016/j.jpowsour.2016.07.065.
  14. Hui Liu; Zhu Duan; Haiping Wu; Yanfei Li; Siyuan Dong; Wind speed forecasting models based on data decomposition, feature selection and group method of data handling network. Measurement 2019, 148, 106971, 10.1016/j.measurement.2019.106971.
  15. Janet Kolodner; An introduction to case-based reasoning. Artificial Intelligence Review 1992, 6, 3-34, 10.1007/bf00155578.
  16. Agnar Aamodt; Enric Plaza; Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. AI Communications 1994, 7, 39-59, 10.3233/aic-1994-7104.
  17. Meyer, M.D.; Watson, L.S.; Walton, M.; Skinner, R.E. Artificial Intelligence in Transportation: Information for Application; National Research Council: Washington, DC, USA, 2007.
  18. Abutair, H.Y.A.; Belghith, A. Using Case-Based Reasoning for Phishing Detection. In Proceedings of the Procedia Computer Science; Elsevier B.V.: Amsterdam, The Netherlands, 2017; Volume 109, pp. 281–288.
  19. Basit Raza; Yogan Jaya Kumar; Ahmad Kamran Malik; Adeel Anjum; Muhammad Faheem; Performance prediction and adaptation for database management system workload using Case-Based Reasoning approach. Information Systems 2018, 76, 46-58, 10.1016/
  20. Gaëtan Blondet; Julien Le Duigou; Nassim Boudaoud; A knowledge-based system for numerical design of experiments processes in mechanical engineering. Expert Systems with Applications 2019, 122, 289-302, 10.1016/j.eswa.2019.01.013.
  21. Sammut, C.; Webb, G.I. Encyclopedia of Machine Learning and Data Mining, 2nd ed.; Springer: New York, NY, USA, 2017; ISBN 9781489976857.