Prediction of Cellular Network Traffic

Please note this is a comparison between Version 1 by Yudong Zhang and Version 2 by Wendy Huang.

Cellular communication systems continue to develop toward greater intelligence, and the demand for cellular networks keeps increasing as they support the public’s pursuit of a better life. Accurate prediction of cellular network traffic can help operators avoid wasting resources and improve management efficiency. Traditional prediction methods can no longer cope well with the highly complex spatiotemporal relationships of current cellular networks, and prediction methods based on deep learning are steadily gaining ground.

cellular network
traffic prediction
deep learning
Internet
time series data
machine learning

1. Introduction

With the explosive development of the Internet and the widespread use of mobile terminals, total Internet access has grown exponentially. Cellular network communications now reach all aspects of people’s lives, bringing great convenience and meeting the need for a better material and spiritual life [1]. As a result, users generate a huge amount of cellular network traffic data all the time [2]. Operators struggle to keep up with this enormous volume of traffic data and with rising user demand. Accurate cellular network traffic prediction can help mobile operators predict overall network usage and allocate resources accordingly, improving network resource utilization and avoiding resource waste [3]. In addition, accurate prediction can improve the user experience: it enables operators to dynamically adjust base station usage in hotspot areas and at peak times of day to avoid network congestion, so that users can enjoy appropriate services anytime, anywhere [4,5].

Cellular traffic prediction is therefore an indispensable research field, but it remains a challenging problem [6]. At present, the main difficulty lies in the complex spatiotemporal characteristics of the data. Cellular network traffic data are temporal sequence data, but users move easily and quickly between areas, so the traffic values in different regions depend on one another across space, which increases the complexity of the data. Secondly, traditional forecasting methods can only deal with simple temporal data and cannot handle highly complex data with concurrent temporal and spatial characteristics. In addition, traditional methods rely on fixed assumptions, and in real life highly complex data do not satisfy these ideal assumptions. A novel algorithm is therefore needed that can handle highly complex data while ensuring that the temporal and spatial characteristics do not interfere with each other.

2. Cellular Network Traffic Prediction Method

Network traffic data are essentially time series data [7]. In the initial stage, therefore, the network prediction problem was formulated only on the time axis. Forecasting methods predicted the future trend from the past pattern of traffic change and used the differences between traffic values at different moments to quantify traffic growth potential [8]. For example, Historical Averaging (HA) takes the average of the traffic values over a past period as the traffic value at the next moment. Linear Prediction [9], the Autoregressive Integrated Moving Average (ARIMA) model [10,11], and the Seasonal Autoregressive Integrated Moving Average (SARIMA) model [12,13] are classical forecasting methods that usually have complete theoretical support and perform well on time series with obvious periodicity and smoothness. However, as end-user mobility increases, network traffic data also show a degree of regular distribution in space. Traditional linear time series methods analyze the characteristics of traffic data only on the time axis and can therefore no longer produce accurate predictions.

Artificial intelligence methods can extract various characteristics of a dataset well. Deep learning methods, a branch of artificial intelligence, have strong learning ability: they can extract the most useful characteristics from a complex dataset and learn both the regularity and the randomness of the data. For nonlinear time series data, deep learning can learn the temporal and spatial characteristics well and obtain accurate prediction results. Prediction methods based on deep learning have therefore become a hot spot in cellular network traffic prediction research. In recent years, machine learning and deep learning methods have received widespread attention for their good fitting, inference, and expression abilities, and they have been applied to various fields [14].
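
To make the simplest of the classical baselines described above concrete, the following is a minimal sketch of Historical Averaging and of the differencing used as a growth signal. It assumes NumPy is available; the function names, the window size, and the traffic values are hypothetical and chosen only for illustration, not taken from any of the cited works.

```python
import numpy as np


def historical_average_forecast(traffic, window):
    """Predict the next value as the mean of the last `window` observations."""
    traffic = np.asarray(traffic, dtype=float)
    if window < 1 or window > traffic.size:
        raise ValueError("window must be between 1 and len(traffic)")
    return float(traffic[-window:].mean())


def first_differences(traffic):
    """Differences between consecutive traffic values (a simple growth signal)."""
    return np.diff(np.asarray(traffic, dtype=float))


# Hypothetical hourly traffic volumes for a single cell (illustrative only).
hourly_traffic = [120.0, 135.0, 150.0, 165.0]

ha_prediction = historical_average_forecast(hourly_traffic, window=3)
growth = first_differences(hourly_traffic)

print(ha_prediction)  # mean of the last three values: 135, 150, 165
print(growth)         # change between consecutive hours
```

Because the sketch looks only at one cell's own history, it illustrates the limitation noted above: nothing in it can use traffic observed in neighboring regions.
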
Many researchers also use machine learning and deep learning as technical tools to study cellular network traffic prediction problems. Among them, Zhang, L. et al. [15] proposed a model based on Support Vector Machine (SVM) for multi-step traffic flow prediction. SVM can find the optimal hyperplane once it has found the regularity in the data, but it offers no general solution to nonlinear problems, and a suitable kernel function is sometimes difficult to find. Awan, F. M. et al. [16] used noise pollution data to make smart city traffic predictions based on LSTM. Fu, R. et al. [17] used LSTM and the gated recurrent unit (GRU) for short-term traffic flow prediction. LSTM and GRU capture the time series characteristics of traffic data well, but these studies did not consider the impact of other factors on cellular network traffic data.

Zhang, C. et al. [18] proposed a neural network model, STDenseNet, which used densely connected convolutional neural networks (CNN) [19] to design two networks with a shared structure, one for learning temporal proximity and one for learning temporal periodicity. Since CNN performs well at capturing spatial characteristics, STDenseNet also learned spatial characteristics while learning temporal characteristics. Daoud, N. et al. [20] used three neural network models, LSTM, CNN-LSTM, and ConvLSTM, to predict the aerosol optical depth (AOD) of the global dust zone. However, CNN is more suitable for extracting spatial characteristics than temporal characteristics, while LSTM is more suitable for extracting temporal characteristics. CNN is also limited to characterizing grid-based traffic data [21,22]: it suits Euclidean structures and cannot easily be applied to non-Euclidean structures, whereas in reality most traffic data are non-Euclidean.

Guo, K. et al. [23] proposed a graph convolutional recurrent neural network model named OGCRNN for traffic prediction, using the Graph Convolution Gated Recurrent Unit (GCGRU) [24] to learn the spatial-temporal characteristics of traffic data. Graph convolutional networks (GCN) can be applied to non-Euclidean structures, which is more realistic. Yu, B. et al. [25] proposed a neural network model named STGCN. It constructed traffic networks on graphs and proposed a spatial-temporal GCN that used two graph convolution algorithms, Chebyshev Polynomials Approximation [26,27] and 1st-order Approximation [28], to process the spatial characteristics of traffic networks. GRU was used in the temporal layers to process the temporal characteristics of the traffic network, and accuracy improved significantly. The spatial-temporal module in STGCN has a “sandwich” structure, in which a spatial layer is sandwiched between two temporal layers. The input of the spatial layer is the output of a temporal layer plus the adjacency matrix, but the output generated by the temporal layer cannot fully represent the pattern of the original traffic network data, and the original data pattern may change through the complex temporal layer operations. OGCRNN [23] and STGCN [25] were designed for the spatial-temporal characteristics of transportation traffic data, not for those of cellular data [29].
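
As a rough illustration of how a graph convolution handles non-Euclidean structure, the sketch below applies one first-order graph convolution step in NumPy. It assumes the commonly used renormalized form D^{-1/2}(A + I)D^{-1/2} X W, with A the adjacency matrix, X the node features, and W a weight matrix; whether this matches the exact formulation in [28] or the layer used in STGCN is an assumption. The function names, the three-node graph, and the feature values are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix A."""
    adj = np.asarray(adj, dtype=float)
    a_tilde = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_tilde.sum(axis=1)              # degrees are >= 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv_first_order(adj, features, weights):
    """One first-order graph convolution step (no activation): A_hat @ X @ W."""
    return normalized_adjacency(adj) @ features @ weights


# Hypothetical graph of three base stations connected in a line: 0 - 1 - 2.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
features = np.array([[10.0], [20.0], [30.0]])  # one traffic value per node
weights = np.array([[1.0]])                    # identity-like weight for clarity

smoothed = graph_conv_first_order(adjacency, features, weights)
print(smoothed)  # each node's value is mixed with its neighbors' values
```

Unlike a CNN kernel, which assumes a regular grid, the mixing here is driven entirely by the adjacency matrix, so nodes can have any number of neighbors.
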