Network Traffic Analysis: Comparison
Please note this is a comparison between Version 1 by Adimoolam Malaiyalathan and Version 2 by Sirius Huang.

Network data traffic is growing as networks expand to serve applications that carry text, image, audio, and video content. Identifying traffic patterns and analyzing the traffic of data content are essential across many needs and scenarios. Many approaches have been followed, both before and after the introduction of machine learning and deep learning algorithms as intelligent computation. Network traffic analysis is the process of capturing the traffic of a network and examining it closely to determine what is happening in that traffic. To enhance the quality of service (QoS) of a network, it is important to estimate the network traffic and to analyze the accuracy and precision of the estimate, as well as its false positive and false negative rates, with suitable algorithms.

  • machine learning
  • deep learning
  • network traffic
  • traffic prediction
  • reinforcement learning
  • internet traffic

1. Introduction

Internet data traffic has grown enormously with the arrival of big data applications and faster network components. The resources of such networks are essential and have to be used for their intended purpose; at the same time, monitoring and predicting data traffic is a very challenging task for various reasons. For data traffic prediction, both manual processing-based methods and artificial intelligence-based methods and techniques have already been deployed. Some of them are as follows: (i) the prediction of daily internet traffic using a data mining technique for a smart university application [1]; (ii) a low-complexity boosting machine learning algorithm with classification and regression to predict internet data traffic, moving from weak learning to strong learning [2]; (iii) a double exponential predictor [3], based on artificial neural network (ANN), classic time series, and wavelet transform-based predictors; (iv) a deep learning-based prediction [4] for metropolitan area network traffic; (v) a neural network ensemble [5] for internet traffic forecasting.
Network traffic prediction [6] is essential because of the cost of bandwidth, time complexity measurement, data prediction, suspicious traffic identification, and so on. The impact of network traffic is directly proportional to the bandwidth and life span of the network and to multi-user identification. A recurrent neural network (RNN) was adopted to predict network traffic for proactive network management and planning [7]. The RNN prediction formula is shown in Equation (1).
 
$$ y_i = w_i h_i + y_{i-1} \tag{1} $$
where $y_i$ is the predicted value at time $i$, $w_i$ is the weight of the input, $h_i$ is the hidden layer state at time $i$, and $y_{i-1}$ is the predicted value at time $i - 1$ [8]. The RNN network structure was tested on the GEANT backbone network with optical network parameters over 200 epochs. Another work discussed four different algorithms for predicting network traffic [9]: RNN, deep learning stacked auto-encoder, multilayer perceptron (MLP), and MLP with back-propagation. Aspects such as adaptive applications, bandwidth detection, congestion control, anomaly detection, and admission control have been discussed in connection with time series internet traffic prediction.
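The recurrence in Equation (1) can be illustrated with a minimal Python sketch. It only unrolls the output step, assuming the weights and hidden states are already available as plain numbers; the function name `predict_sequence` and the initial value `y0` are hypothetical and do not come from the cited works.

```python
def predict_sequence(weights, hidden_states, y0=0.0):
    """Unroll y_i = w_i * h_i + y_{i-1} over a sequence.

    weights and hidden_states are equal-length lists of floats (assumed
    to be given); y0 is an assumed starting value for the prediction.
    """
    if len(weights) != len(hidden_states):
        raise ValueError("weights and hidden_states must have equal length")

    predictions = []
    y_prev = y0
    for w, h in zip(weights, hidden_states):
        y = w * h + y_prev      # Equation (1)
        predictions.append(y)
        y_prev = y              # current output feeds the next step
    return predictions


if __name__ == "__main__":
    print(predict_sequence([1.0, 2.0, 0.5], [2.0, 1.0, 4.0], y0=1.0))
```

In a trained RNN the hidden states would themselves be computed from the input traffic series; that part is left out here because the text does not specify it.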
The applications of network traffic analysis and prediction include bandwidth monitoring, data analysis, efficient network management for intended users, and so on. Little research has been carried out to measure bandwidth efficiency. One work discussed a bandwidth utilization and forecasting model built with ARIMA and SNMP setups [10]. The computational time was measured, with a reported figure of 83.2%, along with the forecast error and standard deviation.
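To make the forecast error and standard deviation concrete, the following sketch compares a forecast series with observed bandwidth values. The text does not say which error measure was used in [10], so the mean absolute error and the population standard deviation of the errors are assumptions chosen for illustration, and `forecast_error_summary` is a hypothetical helper name.

```python
import statistics


def forecast_error_summary(actual, forecast):
    """Summarize forecast quality for two equal-length series.

    Returns the per-point errors (actual - forecast), their mean absolute
    value, and their population standard deviation. The choice of these
    two measures is an assumption, not taken from the cited work.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")

    errors = [a - f for a, f in zip(actual, forecast)]
    mean_abs_error = sum(abs(e) for e in errors) / len(errors)
    error_std = statistics.pstdev(errors)
    return errors, mean_abs_error, error_std


if __name__ == "__main__":
    # Made-up bandwidth utilization values, for illustration only.
    observed = [10.0, 12.0, 14.0]
    predicted = [9.0, 13.0, 14.0]
    print(forecast_error_summary(observed, predicted))
```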
Another application-oriented network traffic prediction work aimed at accurate and timely internet traffic information [4]. The proposed mechanism addressed anomaly detection, admission control, bandwidth allocation, and congestion control, using big traffic data and a deep architecture model for internet traffic flow prediction. Its novelty lies in considering spatial and temporal correlations, as well as the characteristics of the flow data. The model was trained in a greedy layer-wise fashion. The dataset was taken from China Unicom for network utilization. Another work surveyed real-world network traffic prediction with various machine learning algorithms under a cognitive approach [11]. There, the applications are grouped by task: classification for threat categories, regression for value prediction, and ranking for ordering traffic. The learning algorithms discussed were neural networks, linear time series models, principal component analysis (PCA), linear regression (LR), statistical models, and support vector machines (SVM), for either long-term or short-term predictions. Data availability and system complexity were taken as the performance measures for both local and wide-area networks. The applications supported by the discussed techniques included cellular traffic, optical networks, LTE networks, IP networks, TCP traffic, MPEG and JPEG traffic, Ethernet traffic, and many more.

2. The Applications of Network Traffic Analysis

The major applications of network traffic prediction are network management, resource allocation, quality of service (QoS) from the internet service provider (ISP), cyberspace security protection, and malware detection [12]. The ISCX and QUIC public datasets were used there to measure performance with a proposed method, a multi-task learning framework. Another work concerned the online application of current internet performance measures, analyzing encrypted packets and both virtual private network (VPN) and non-VPN traffic with a proposed method referred to as deep packet. This work considered file transfer protocol (FTP) and peer-to-peer (P2P) network traffic. Recall was measured on the UNB ISCX VPN and non-VPN dataset [13]. A network classifier approach was deployed for hyper-text transfer protocol (HTTP) and session initiation protocol (SIP) traffic [14]. Performance measures such as duration, latency, and traffic volume were measured using RNN and CNN learning algorithms.
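Recall, together with accuracy, precision, and the false positive and false negative rates, can be computed from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not results from the cited works.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from counts.

    tp, fp, tn, fn are the true positive, false positive, true negative,
    and false negative counts. A ratio with a zero denominator is
    reported as 0.0 (an assumption made to keep the sketch simple).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }


if __name__ == "__main__":
    # Made-up counts for a two-class traffic classifier.
    for name, value in classification_metrics(tp=8, fp=2, tn=88, fn=2).items():
        print(f"{name}: {value:.3f}")
```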
In the past decade, more than 40 research works have discussed network traffic analysis through manual traffic prediction, machine learning-based network traffic prediction, and deep learning-based traffic prediction. Most of them used machine learning and deep learning algorithms to predict network traffic. One work predicted network traffic using a time series approach with a recurrent neural network (RNN) [7]. Variants of the RNN were also analyzed using a past network traffic dataset, and performance was measured on the GEANT research and education network. The experiment was conducted over 200 epochs with learning rates from 0.01 to 0.5. The long short-term memory (LSTM) variant performed better than the other RNN variants.
For network traffic, the most cited articles were related to deep learning and machine learning traffic identification algorithms. One work proposed a lightweight framework based on a deep learning algorithm. This framework targeted encrypted traffic, performed deep full-range classification, and detected intrusions using two datasets [15]. Another application, user activity monitoring based on network traffic, was developed using machine learning algorithms [16]. K-means and random forest (RF) algorithms were used to measure the network traffic QoS, accuracy, and real-time traffic generated within a time bound. A traffic classification approach for network management in software-defined networks (SDN) was tested with a CNN and a stacked auto-encoder (SAE). The proposed work was used for online traffic services, and recall, accuracy, and precision were measured for the deep learning algorithms [17]. The CNN prediction formula, a neuron calculation for traffic, is given in Equation (2).
 
$$ y_i = b_i + \sum_{i=1}^{n} w_i \times x_i \tag{2} $$
where $y_i$ is the neuron calculation, $w_i$ is the weight matrix of the input, $x_i$ is the input, and $b_i$ is the bias of the neuron.
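The neuron calculation in Equation (2) amounts to a bias plus a weighted sum of the inputs. A minimal sketch, assuming scalar weights and inputs held in plain Python lists (the function name `neuron_output` and the example values are hypothetical):

```python
def neuron_output(weights, inputs, bias):
    """Return bias + sum(w * x) over paired weights and inputs.

    This is the pre-activation value of a single neuron; any activation
    function applied afterwards is outside the scope of Equation (2).
    """
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have equal length")
    return bias + sum(w * x for w, x in zip(weights, inputs))


if __name__ == "__main__":
    print(neuron_output([0.5, -1.0, 2.0], [2.0, 1.0, 0.5], bias=0.25))
```

In a convolutional layer the same calculation is repeated for each position of the filter over the input, with the weights shared across positions.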