In recent decades, air pollution has been a serious environmental issue, and several developed and developing countries have suffered from heavy air pollution [1]. The identification of atypical pollution in the quantified concentrations of these compounds has been a significant problem for health [2]. Compared to other forms of pollution, air pollution has a direct impact on people’s health, and its major causes are natural disasters, residential heating, exhaust from industries and factories, and the burning of fossil fuels [3,4]. Therefore, predicting the mass concentrations of air pollutants is essential and plays a crucial role in atmospheric management decisions [5]. Additionally, existing epidemiological studies state that PM2.5 causes negative human health effects, such as respiratory and cardiovascular diseases [6,7]. Effective forecasting of air pollutant concentrations therefore strengthens the prevention of air pollution, which helps in achieving efficient environmental management [8]. In addition, it has great significance for government decision making and people’s health [9]. Poor AQP not only affects the human physical condition but also has a key impact on societal and economic controls [10].
2. Effective Air Quality Prediction Using Reinforced Swarm Optimization and Bi-Directional Gated Recurrent Unit
To predict the air quality in Tripoli, Esager and Ünlü [14] evaluated deep learning models for hourly PM2.5 surface mass concentrations. Since the analyzed data are a time series, the Box–Jenkins methodology is generally used to model such a dataset. This study gave particular attention to recurrent neural networks of the LSTM and GRU types combined with a CNN. The results demonstrate the strong forecasting power of the algorithms used. A key benefit of this type of model is that it does not require the strict assumptions that traditional models do. These algorithms were also quite effective in capturing the data’s nonlinear behavior.
Du et al. [15] implemented a hybrid deep learning architecture for effective air pollution forecasting. The hybrid architecture, which combines a Bi-LSTM and a convolutional neural network (CNN), learns multivariate, temporal, and spatial correlation features from the collected time series data for effective forecasting of air quality. The experiments conducted on two real-world datasets demonstrated that the hybrid architecture was effective in PM2.5 air pollution prediction, with better accuracy. However, the integration of deep learning models increased the time complexity and computational cost because it required an enormous amount of data to obtain satisfactory results.
Usually, the dynamics of air pollution are affected by various factors, such as rainfall, snowfall, wind speed, wind direction, humidity, and temperature. These factors increase the difficulty of understanding the changes that occur in air pollutant concentrations. Tao et al. [16] integrated the CNN and Bi-GRU models for effective forecasting of air pollution. The experiments conducted on the Beijing PM2.5 dataset from the UCI machine-learning repository demonstrated the effectiveness of the hybrid deep learning model, as it achieved better results than traditional models. As mentioned earlier, the integration of two deep learning models leads to high time complexity.
Ma et al. [17] used a Bi-LSTM network with transfer learning for forecasting air pollution in Anhui, China. The numerical results showed that the Bi-LSTM network with transfer learning achieved a 35% lower error rate than the existing models on a real-time dataset. However, the developed network was not scalable and was time-consuming when running experiments on a real-time dataset.
Chang et al. [18] implemented a new aggregated LSTM network for effective air pollution forecasting. The aggregated LSTM network combines information about external pollution sources, stations near industrial areas, and stations with local air pollution monitoring systems. Here, three LSTM models were aggregated in order to improve prediction accuracy, but the process was computationally complex.
Castelli et al. [19] employed a machine learning technique called support vector regression (SVR) for forecasting the air quality index (AQI) and pollutant levels. After the acquisition of time series data, data preprocessing (data transformation, outlier removal, and imputation of missing data) and feature engineering were carried out. Finally, the air pollution prediction was performed using the SVR technique. However, SVR underperforms when the number of features for each data point exceeds the number of training samples.
Xayasouk et al. [20] integrated a deep autoencoder and an LSTM network for air pollution prediction. In addition, Wen et al. [21] combined a CNN and an LSTM network for effective forecasting of air pollution in China. Wang et al. [22] implemented a two-layer air pollution prediction model based on a GRU and an LSTM network. The numerical outcomes confirmed that the presented hybrid models obtained higher prediction performance than existing ones at different regional scales. Hybrid deep learning models can handle complex and large data, but they are computationally expensive.
Air pollution is becoming a serious problem due to the rapid growth of industrialization. In the present scenario, predicting air pollution is crucial in determining prevention measures for avoiding disasters. Zhang et al. [23] utilized a light gradient boosting technique for selecting discriminative features from real-time datasets. Further, the selected 500 feature vectors were given to the eXtreme Gradient Boosting (XGBoost) technique for air pollution forecasting.
Wang et al. [24] initially adopted the Hampel identifier and the variational mode decomposition (VMD) technique for detecting and eliminating outliers from the acquired datasets. Then, the optimal feature vectors were selected from the denoised data by employing a sine-cosine algorithm, and finally, an extreme learning machine (ELM) was implemented for accurate forecasting of air pollution. Generally, standard machine learning techniques, such as XGBoost and ELM, are sensitive to outliers and prone to overfitting when analyzing complex time series data.
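As an illustration of the outlier-detection step, the following is a minimal sketch of a Hampel identifier in plain Python. It is not the implementation used in [24]; the function name `hampel_filter`, the window size, and the threshold of three scaled median absolute deviations are assumptions chosen for the example.

```python
from statistics import median

def hampel_filter(series, half_window=3, n_sigmas=3.0):
    """Flag points that deviate strongly from the local median and
    replace them with that median (illustrative parameters)."""
    k = 1.4826  # scales the MAD to match the standard deviation for Gaussian data
    cleaned = list(series)
    outlier_idx = []
    for i in range(len(series)):
        # Window is truncated at the edges of the series.
        lo = max(0, i - half_window)
        hi = min(len(series), i + half_window + 1)
        window = series[lo:hi]
        med = median(window)
        mad = median([abs(x - med) for x in window])
        if abs(series[i] - med) > n_sigmas * k * mad:
            cleaned[i] = med
            outlier_idx.append(i)
    return cleaned, outlier_idx

# Hypothetical hourly PM2.5 readings with one spurious spike.
readings = [10.0, 11.0, 10.5, 95.0, 11.2, 10.8, 10.9]
cleaned, outliers = hampel_filter(readings)
print(outliers)    # index of the flagged reading
print(cleaned[3])  # the spike replaced by the local median
```

Because the rule is based on the median rather than the mean, a single extreme reading does not inflate the threshold used to detect it.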
Akbal and Ünlü [25] modeled the PM of the Turkish city of Ankara using a hybrid deep learning methodology. PM levels were categorized according to the WHO’s criteria to define the prediction problem. Further, by using the ensemble machine learning methodology of random forest regression (RFR), extra tree regression (ETR), and multiple linear regression (MLR), the impact of various contaminants and meteorological variables on the prediction of PM was examined. The findings indicated that other substances, the Earth’s surface temperature, wind speed, and PM’s own lagged values were the most crucial predictor variables for PM.
Li et al. [26] employed the Hampel filter and least square support vector machine (SVM) regression for AQI forecasting. Maleki et al. [27] implemented an artificial neural network (ANN) for air pollution forecasting. However, the ANN was a simpler deep learning model and required more training data to obtain satisfactory results. Mao et al. [28] implemented a temporal sliding LSTM network for effective prediction of air quality. The presented temporal sliding LSTM network achieved better prediction results, supporting stronger atmospheric decision making.
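Several of the recurrent models reviewed here, including the temporal sliding LSTM, consume fixed-length windows of past observations. The sketch below shows one generic way to turn a univariate series into such input/target pairs; it does not reproduce the windowing scheme of [28], and the function name `make_windows` and its parameters `n_lags` and `horizon` are hypothetical.

```python
def make_windows(series, n_lags=3, horizon=1):
    """Build (inputs, targets) pairs: each input holds n_lags consecutive
    values, and the target is the value `horizon` steps after the window."""
    inputs, targets = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets

# Hypothetical hourly concentrations.
pm = [12, 15, 11, 18, 20, 17]
X, y = make_windows(pm, n_lags=3, horizon=1)
for window, target in zip(X, y):
    print(window, "->", target)
```

Each window would then be fed to the recurrent network as one training sample, with the window length and forecast horizon treated as tunable choices.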
Zhang et al. [29] integrated empirical mode decomposition (EMD) and a Bi-LSTM network for effective forecasting of AQI. Firstly, the EMD technique was employed for decomposing PM2.5 time series data and extracting the amplitude and frequency features. Secondly, the obtained features were given to the Bi-LSTM network for AQI forecasting. The experiments conducted on the PM2.5 and Beijing hourly datasets demonstrated the efficacy of the developed EMD-Bi-LSTM model in terms of error rate. However, in the time series analysis, the Bi-LSTM network was slower and consumed more time for model training.
Zeinalnezhad et al. [30] integrated an adaptive neuro-fuzzy inference system (ANFIS) and semi-experimental nonlinear regression for predicting the concentration of important pollutants. However, standard ANFIS models have a few problems, such as the curse of dimensionality, high computational expense, and loss of data interpretability.
Aarthi et al. [31] initially used a Min-Max normalization technique for filling in the missing attributes in the collected dataset, and then, the optimal attributes were selected from the preprocessed data by implementing a balanced spider monkey optimization (BSMO) algorithm. Based on the balancing factor, the BSMO algorithm selects the relevant attributes, which are given to the Bi-LSTM network for AQP. The developed BSMO algorithm efficiently finds the optimal solution but has a poor convergence rate.
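For reference, Min-Max normalization in its usual form rescales each attribute linearly to the range [0, 1]. The sketch below is a generic illustration, not the preprocessing code of [31]; the function name `min_max_normalize` is hypothetical, and missing entries (represented here as `None`) are simply left in place, since how the cited work fills them is not described in this review.

```python
def min_max_normalize(values):
    """Rescale the observed values to [0, 1]; None entries stay None."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = hi - lo
    # A constant attribute has zero span; map it to 0.0 to avoid division by zero.
    return [
        None if v is None else ((v - lo) / span if span else 0.0)
        for v in values
    ]

# Hypothetical attribute column with one missing reading.
print(min_max_normalize([10.0, 20.0, None, 30.0]))
```

Scaling attributes to a common range keeps features with large numeric values, such as pollutant concentrations, from dominating those with small values during attribute selection and network training.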