Rainfall Prediction System

Rainfall prediction is one of the most challenging tasks in the weather forecasting process. Accurate rainfall prediction has become more difficult than before because of extreme climate variations.

  • rainfall
  • rainfall prediction
  • machine learning

1. Introduction

Knowledge extraction from time series data has become a widely explored research area [1,2]. Data collected with time stamps in a specific pattern are called time series data [3,4,5]. Such time-oriented data are collected at a specific time interval, such as hourly, daily, or weekly. Time series data can be used to make predictions in many domains, including foreign currency rates, stock market trends, energy consumption estimation, and climate change. Machine learning and data mining techniques can extract hidden patterns from historical data in order to forecast future trends [1,2,5,6]. Weather forecasting on the basis of historical data is a complex but very beneficial task [7], and several problems need to be solved in order to achieve optimal results. Weather-related data consist of various attributes or features, such as temperature, pressure, humidity, and wind speed. Machine learning techniques predict future weather conditions by using hidden patterns and relations among the features of historical weather data [2]. Precipitation prediction is one of the crucial stages of the weather forecasting process.
A smart city is a place where all the community elements, including people and devices, are connected with advanced technologies. In these urban areas, data are collected from citizens as well as from buildings through sensors and electronic devices; the data are then used to manage resources, services, and assets effectively and efficiently. In such technologically advanced cities, data are processed, analyzed, and then used to monitor and manage various systems and activities, which makes data a critical resource.
The data collected from different sources in smart cities are ultimately used in various automatic systems, including traffic and transportation systems, water supply networks, power plants, waste collection and disposal systems, crime detection systems, education systems, and other community services. Machine learning and artificial intelligence techniques are considered a crucial element in the services and products of smart cities. Weather forecasting is necessary for the citizens of smart cities so that they can plan their activities according to the predicted weather. In particular, accurate and timely rainfall prediction in smart cities can help in arranging planning and security measures in advance for flight operations, agricultural tasks, water reservoir systems, and construction and transportation activities [2,8,9]. An advance red alert in the case of extreme rainfall can save the citizens of smart cities from potentially life-threatening situations.

2. Literature Review of Rainfall Prediction System

Improving the accuracy of machine learning techniques for weather forecasting has been the primary concern of many researchers over the last two decades. Some of the related studies are discussed here. In [18], researchers presented an ANN-based technique to predict atmospheric conditions. The dataset used for prediction consisted of various weather attributes, e.g., humidity, temperature, and wind speed. The proposed technique integrated the Back Propagation Network (BPN) and the Hopfield Network (HN) in such a way that the output of the BPN is given to the HN as input. This technique works by exploring the non-linear relationship between historical weather attributes. In [19], researchers used an ANN to predict the monthly average rainfall of monsoon weather in India. A dataset covering a period of 8 months each year was used for prediction. The selected months were considered to have a high probability of rainfall. Three different types of networks were used for performance analysis: Feed Forward Back Propagation, Layer Recurrent, and Cascaded Feed Forward Back Propagation. According to the results, Feed Forward Back Propagation outperformed the others. In [20], researchers proposed a rainfall prediction technique which used genetic algorithms for feature selection and Naïve Bayes as the predictive algorithm. The proposed solution had two steps: the first step predicts whether it will rain or not, and the second step classifies the rainfall as light, moderate, or strong. In [21], researchers presented a framework consisting of deep neural networks (DNNs) to predict weather changes over the next 24 h. For prediction, they used a dataset covering 30 years, from 1983 to 2012, obtained from the Hong Kong Observatory (HKO). The dataset consisted of four weather attributes: temperature, dew point, mean sea level pressure, and wind speed. According to the results, DNNs provided a good feature space for weather datasets.
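The two-step scheme described for [20] can be illustrated with a minimal sketch. It assumes a small hand-written Gaussian Naïve Bayes (the class `GaussianNB` below is written for this example and is not taken from [20]), omits the genetic-algorithm feature selection entirely, and uses made-up readings of humidity (%) and pressure (hPa). The function names and the label "none" are hypothetical.

```python
import math
from collections import defaultdict


class GaussianNB:
    """Tiny Gaussian Naive Bayes for numeric features (illustrative only)."""

    def fit(self, rows, labels):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.stats = {}
        for label, members in grouped.items():
            columns = list(zip(*members))
            means = [sum(c) / len(c) for c in columns]
            # A small constant keeps every variance strictly positive.
            variances = [
                sum((x - m) ** 2 for x in c) / len(c) + 1e-6
                for c, m in zip(columns, means)
            ]
            log_prior = math.log(len(members) / len(rows))
            self.stats[label] = (log_prior, means, variances)
        return self

    def predict(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                for x, m, v in zip(row, means, variances)
            )
        return max(self.stats, key=log_posterior)


def fit_two_stage(rows, intensities):
    """Step 1 learns rain vs. no rain; step 2 learns intensity on rainy rows only."""
    occurrence = GaussianNB().fit(rows, [i != "none" for i in intensities])
    wet = [(r, i) for r, i in zip(rows, intensities) if i != "none"]
    intensity = GaussianNB().fit([r for r, _ in wet], [i for _, i in wet])
    return occurrence, intensity


def predict_two_stage(models, row):
    occurrence, intensity = models
    if not occurrence.predict(row):
        return "none"
    return intensity.predict(row)


# Made-up (humidity %, pressure hPa) readings, for illustration only.
rows = [(40, 1020), (45, 1018), (50, 1019), (70, 1010), (72, 1009),
        (80, 1004), (82, 1003), (92, 996), (95, 995)]
labels = ["none", "none", "none", "light", "light",
          "moderate", "moderate", "strong", "strong"]
models = fit_two_stage(rows, labels)
```

Calling `predict_two_stage(models, (94, 995))` runs the intensity classifier only if the first classifier predicts rain, which mirrors the two-step structure rather than any specific result reported in [20].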
In [22], researchers presented a new pre-processing technique using moving average and singular spectrum analysis. The proposed approach can be applied to the training data in order to transform it into low, medium, and high categories. Prediction was performed using an Artificial Neural Network (ANN). Two daily rainfall datasets, from the Zhenshui and Da’ninghe watersheds in China, were used for the experiments.
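As a rough illustration of the pre-processing idea in [22], the sketch below smooths a daily rainfall series with a trailing moving average and then bins the smoothed values into low, medium, and high categories. Singular spectrum analysis is not shown, and the window length, the cut-off values (5 and 15 mm), and the sample series are assumptions chosen for the example, not values from [22].

```python
def moving_average(values, window):
    """Trailing moving average; the first window-1 points use a shorter span."""
    smoothed = []
    for i in range(len(values)):
        span = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(span) / len(span))
    return smoothed


def to_category(value, low_cut, high_cut):
    """Map a smoothed rainfall value to a coarse class label."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


# Made-up daily rainfall in mm.
daily_rain = [0.0, 2.0, 4.0, 12.0, 30.0, 6.0, 0.0]
smoothed = moving_average(daily_rain, window=3)
categories = [to_category(v, low_cut=5.0, high_cut=15.0) for v in smoothed]
```

Smoothing before binning means a single extreme day (30 mm here) raises the category of its neighbours as well, which is the intended effect of the moving average on a noisy series.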
In [23], researchers proposed a hybrid method for rainfall forecasting by integrating feature extraction and prediction techniques. The dataset used for the experiment was obtained from the National Oceanic and Atmospheric Administration (NOAA); it spanned more than 50 years and consisted of various weather features such as humidity, pressure, temperature, and wind speed. A Neural Network was used to classify the instances into low, medium, and high classes based on a pre-defined training set. In [24], researchers presented a data-intensive model for rainfall prediction using a Bayesian modeling approach. For the experiment, the dataset was collected from the Indian Meteorological Department, and the 7 most relevant attributes were selected from 36. Before the prediction, pre-processing and transformation steps were performed for smooth processing. The proposed approach showed good accuracy for rainfall prediction while using moderate computing resources, in contrast to meteorological centers that rely on high-performance computing for weather predictions. In [25], researchers compared different machine learning techniques for the prediction of rainfall in Malaysia. The techniques included Naïve Bayes, Neural Network, SVM, Decision Tree, and Random Forest. Pre-processing was performed on the dataset to fill in the missing values and to remove noise before classification. Random Forest outperformed the others; it correctly classified a large number of instances with a small portion of training data. In [26], the technique of Clusterwise Linear Regression (CLR) was employed, which integrates clustering and regression methods. The proposed CLR technique predicted the monthly rainfall in Victoria, Australia. The dataset used was obtained from eight geographically diverse weather stations and spanned 1889 to 2014.
The performance was compared with other published techniques; it was shown that in most of the locations, CLR performed better than the others. In [27], researchers compared “Markov Chain extended with rainfall prediction” with other widely used data mining techniques, including Radial Basis Neural Networks, Genetic Programming, Support Vector Regression, M5 Rules, k-Nearest Neighbors, and M5 Model trees. A dataset obtained from 42 cities was used for the experiment. The results showed that the Markov Chain technique can be outperformed by machine learning techniques. The correlation between weather-related attributes and accuracy was also noted.
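The source does not describe how the Markov Chain benchmark in [27] is built. As a generic illustration only, the sketch below estimates a first-order, two-state (dry/wet) chain from a made-up sequence of daily states; the function name and the sample sequence are assumptions.

```python
from collections import Counter


def transition_probabilities(states):
    """Estimate P(next state | current state) from consecutive pairs."""
    pair_counts = Counter(zip(states, states[1:]))
    from_counts = Counter(states[:-1])
    return {
        (current, nxt): count / from_counts[current]
        for (current, nxt), count in pair_counts.items()
    }


# Made-up daily sequence: "D" = dry day, "W" = wet day.
history = list("DDWWDWWWDD")
probs = transition_probabilities(history)
# probs[("W", "W")] estimates the chance that a wet day follows a wet day.
```

A chain like this only looks at the previous day's state, which is one reason feature-based machine learning models can outperform it when additional weather attributes are available.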
In [28], two ANN-based forecasting models were developed for rainfall prediction: the first predicted 1 month ahead, whilst the second predicted 2 months ahead. A dataset from several locations in north India was used for the experiment. The model integrated the Feed Forward Neural Network with the Back Propagation technique, along with the Levenberg–Marquardt training function. The performance was analyzed in terms of Mean Square Error and Magnitude of Relative Error. According to the results, the 1-month ahead forecasting model outperformed the 2-month model. In [29], researchers proposed a framework named the Wavelet Neural Network (WNN) to predict rainfall. The proposed solution integrated ANN with the wavelet technique. Both models (ANN and WNN) were used for prediction with historical rainfall data from the Darjeeling rain gauge station, situated in West Bengal, India. According to the results, WNN outperformed ANN. In [30], researchers presented an SVM-based application for weather prediction. A time series dataset covering the past n days at a location was analyzed, and then the maximum temperature of that location for the next day was predicted. With optimal values of the kernel function, the proposed application was found to outperform a Multi-Layer Perceptron (MLP) trained with a back-propagation algorithm. To train the SVM, a nonlinear regression method was found to be suitable. In [31], researchers presented an advanced statistical technique for solar power forecasting based on an artificial intelligence approach. The proposed technique requires several features as input, such as past power measurements and meteorological forecasts. The required meteorological data included solar irradiance, relative humidity, and temperature. A self-organizing map (SOM) was trained to classify the local weather 24 h in advance with the help of online meteorological services.
The proposed method was considered to be suitable for the forecasting of 24 h ahead power output of a PV (photovoltaic) system, as well as for trading in electricity markets of PV power system operators.
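Two ideas from the studies above can be made concrete: the "past n days to next day" framing used in [30] and the error measures named in [28]. The sketch below builds sliding windows from a made-up maximum-temperature series, forecasts each next day with a naive window mean (a stand-in baseline, not the SVM of [30] or the ANN of [28]), and scores the forecasts with Mean Square Error and the mean Magnitude of Relative Error. All function names and values are assumptions for illustration.

```python
def make_windows(series, n):
    """Pair each run of n consecutive values with the value that follows it."""
    return [(series[i:i + n], series[i + n]) for i in range(len(series) - n)]


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_magnitude_of_relative_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|; assumes no actual value is zero."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up daily maximum temperatures in degrees Celsius.
max_temps = [30.0, 32.0, 31.0, 33.0, 35.0, 34.0]
windows = make_windows(max_temps, n=3)

# Naive baseline: forecast the next day as the mean of the previous n days.
baseline = [sum(past) / len(past) for past, _ in windows]
actual = [target for _, target in windows]

mse = mean_squared_error(actual, baseline)
mmre = mean_magnitude_of_relative_error(actual, baseline)
```

The relative-error measure is undefined when an observed value is zero, which is common in daily rainfall series, so it suits quantities such as temperature or aggregated monthly rainfall better than raw daily totals.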
In [32], researchers presented a modular Support Vector Machine (SVM) technique to simulate and predict rainfall. The proposed technique consisted of several steps, such as the generation of training sets with the bagging sampling technique, training of the SVM kernel function, selection of SVM combination members with the PLS (Partial Least Square) technique, and production of ν-SVM. The proposed technique was used for monthly rainfall prediction in Guangxi, China, and outperformed other models.
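The first step listed for [32], generating training sets by bagging, amounts to sampling the original data with replacement. The sketch below shows only that step, under the assumption that each generated set has the same size as the original; the SVM training, PLS-based member selection, and ν-SVM stages are not shown, and the function name and sample data are hypothetical.

```python
import random


def bootstrap_samples(dataset, n_sets, seed=0):
    """Draw n_sets training sets by sampling the dataset with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    size = len(dataset)
    return [[rng.choice(dataset) for _ in range(size)] for _ in range(n_sets)]


# Made-up (month index, rainfall in mm) records.
records = [(1, 40.0), (2, 55.0), (3, 90.0), (4, 130.0), (5, 210.0)]
training_sets = bootstrap_samples(records, n_sets=4)
```

Because sampling is done with replacement, each set typically repeats some records and omits others, which gives the individual models trained on them the diversity that an ensemble relies on.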
Table 1 summarizes the previously published related work. Previously, most researchers used supervised machine learning classifiers in order to predict rainfall by exploring hidden patterns in historical data. The researchers mostly used more than one technique in the proposed frameworks: one for feature selection and one for classification and prediction. Rainfall forecasting using time series weather data has also been widely explored by researchers. This research proposes a framework for rainfall prediction, particularly for smart cities, where real-time weather data is continuously collected from specific weather sensors. Moreover, to increase the performance, the predictive accuracy of four classifiers (DT, NB, KNN, and SVM) is integrated with the help of fuzzy logic.
Table 1. Summary of previous related work.
Reference | Method | Dataset | Dataset Duration | Accuracy (%)
D. Gupta et al. [6] | ANN-based classification model, with 10 hidden layers | Public | 18 years | 82.1
D. Gupta et al. [6] | Classification and Regression Tree-based prediction | Public | 18 years | 80.3
D. Gupta et al. [6] | K nearest neighbor-based prediction, with k = 22 | Public | 18 years | 80.7
J. Joseph et al. [23] | ANN-based hybrid technique, integrating classification and clustering techniques | Private | 4 months | 87
V.B. Nikam et al. [24] | Feature selection-based Bayesian classification model | Public | 6 months | 91
N. Prasad et al. [33] | Decision Tree-based supervised learning in quest (SLIQ) | Public | 14 years | 72.3

This entry is adapted from the peer-reviewed paper 10.3390/s22093504
