Data Plane
-
In this part, all network devices are immersed of collector agents. Centralized collector records send flow and agent samples. Specific flow metrics are collected by the device configurations, and then they are exported to the collector. Currently, major vendors such as Cisco offer export support and built-in flow collection.
Records of network flow are collected by the data collector embedded in the module of the control plane. Then, the control plane filters the data and conducts feature extraction. Thus, different datasets are created and generated by the collector, which are crucial for the ML approach
[26]. The OpenFlow controller should be communicating with all network devices known as data sources.
The constructed and implemented model of ML is used as an application of SDN. Various methods of ML using generated datasets of different flow collectors can be used as applications of SDN for different purposes. The operation of a network can be influenced by constructing different applications powered by the ML-based model. Applications of incident handling include selection of path and rule enforcement. In the following case, the ML model used is an application of SDN-based IDS.
It is predicted that network flows can be classified as being normal or malicious.
3.1.1. DoS, U2R, Probe, and R2L
Four types of attacks are addressed in the studies discussed in the following section: DoS, U2R, probe, and R2L. The NSL-KDD dataset used between all of these include the common characteristics and attacks classified in the four classes.
The D-NN was used by the authors in
[27] to detect six features based on anomalies and are suitable for SDN: type of protocol, count, duration, srv count, src bytes, and dst bytes. The model was trained and tested by the authors, and compared with other processes, such as SVM, NB, multi-layer perceptron, J48, random tree (RT), and random forest (RF) methods. The research explore the applications of DL used in a detection system known as flow-based anomaly detection. At the same time, the authors claimed that the development of machine learning is not fully completed. Ref.
[28] presented a discussion of nine classifiers based on ML with a supervised learning technique. Different tests were performed on accuracy, recall, execution time, false alarm rate, area under curve of ROC, f1-measure, McNemar’s test, and precision. PCA was used in the tests to reduce dimensions with DT, K-NN, LDA, NN, linear SVM, extreme learning machine (ELM), BaggingTrees, NB, RUSBoost, RF, AdaBoost, and LogitBoost. The results showed that performance of bagging and boosting techniques was higher than the other techniques. A subset of dataset features were selected as features in which content features were not included. A hybrid classification system of level-5 was used by the same authors in
[29] for IDS, in a network that was not based on SDN. Use of flow-statistics was the main aim of the research; these flow-statistics were provided by the controller for the development of NIDS. In the first level, k-NN was used as the classification method; for the second level, ELM was used. For the other levels, the hierarchical extreme learning machine (H-ELM) was used. One type of attack was detected by each level using the same preferred features from
[29]. For scalability purposes, the implemented system was in the form of a POX controller module, rather than an application plane function. The effectiveness of the method chosen to select features was because the features could be directly accessed from the controller. The results showed improved accuracy compared with the other methods.
3.1.2. DDoS Attacks
In the following section, DDoS attacks are specifically investigated in the presented studies for two reasons. First, DDoS attacks have been focused on large sections of IDS studies. Second, attacks should be individually considered with the perspective of recent threats, for example, the Mirai botnet and Internet of Things (IoT)
[30]. A specific application using SDN was presented by the authors in
[31] to tackle challenges in anomaly detection regarding scalability. The scenario was wireless SDN, which enabled the E-Health system. Massive machine-type communications were the main feature of such a network, where humans do not interact. For semi-supervised operations, CPLE was the ML technique used, with offline training. The main intention was online testing, so running localized detection was allowed within the devices. The requirement for frequent network traffic collection was avoided by using online testing to update the model of anomaly detection. The features used for classification were similar to the features defined in
[32]. An overview of IDS based on ML in SDN is provided by authors in
[28].
Millions of people may experience significant power outages if an attacker exploits cyber security weaknesses. This
entry addresses this problem by presenting an OPNET-based network model exposed to many DoS assaults, illustrating the cyber security features of IEC 61850-based digital substations
[33].
Five ML techniques were investigated by the study to mitigate DDoS attacks and intrusion (support vector machine, neural networks, Bayesian networks, genetic algorithms, decision tree, and fuzzy logic). Each method was theoretically analyzed by researchers and a comparison scheme was generated that presented the advantages and disadvantages of the approaches. The review could be used to choose the best technique, in accordance with system requirements. In
[34], the authors compared the SVM analysis in SDN with other approaches in defending against a DDoS attack. Threats to the controller regarding security and types of SDN-based DDoS attacks are briefly discussed in the research. Moreover, four methods of SVM and a description of system was provided in the research. For training and testing, the datasets used were the 1999 and 1998 DARPA, and a comparison of approach was done with bagging, RBF, random forest, J48, and naive Bayes methods. Highest accuracy was shown for the proposed SMV, at approximately 95%. The support vector classifier-based learning algorithm, where features are selected by using an ID3 decision tree, was used by authors in
[35]. The following three components, along with the software testbed, was used to evaluate the model: (1) Data collection was done using the sFlow Toolkit. (2) For the virtual switch, Open vSwitch was used. (3) The controller used was Ryu and the dataset was KDD-Cup 1999.
The model used in
[36] was the Dirichlet process mixture model, to mitigate DDoS attacks based on DNS. An owned dataset was used by the authors in
[3]; they created a dataset to generate DDoS attacks. The IDS system was presented by the authors in
[37] for the identification of DDoS attacks. They compared three methods: SVM with 97% accuracy, KNN BEST was 83% accurate, and naïve Bayes was approximately 83% accurate. Features considered as inputs included the number of packets, bandwidth, destination IP, protocol, source IP, and protocol. They used an owned dataset for testing. A proposal was presented by the authors of
[16] to improve resiliency by detecting some DDoS attacks, preferably the SYN flood attack, in the SDN network. Three different techniques were studied for classification: NB, DT, and SVM. DT showed 99% recall, precision, and accuracy. KDD 99 was the dataset used, with features including protocol, src port, src IP address, dst port, and dst IP address. Later, PCA was used by the researchers for reduction. DDoS attack detection and classification was done by researchers using an approach in the cloud environment
[38]. They used a two-stage learning scheme with two stages using two techniques: Bayesian and multivariate Gaussian. The employed features were blacklist IP, dst IP, number of packets, src IP, and spoof dst IP. Although complementary elements to the ML method were included in the study, however they did not directly secure the SDN. As an alternative, some steps were for the security of cloud infrastructure.
3.1.3. Comparison of Various Approaches in SDN
When considering a wide range of attacks regarding cyber security, it is important to have five attacks, including DDoS, U2R, DoS, probe, and R2L. Though SDN is an innovative paradigm, it is still subject to all known attack types. New adapted attacks should also be considered by the research community. Reviewing the adaption of approaches to SDN is essential for the recognition, prevention, and extenuation of attacks. In all traditional networks, the main point of applied ML approaches are to recognize attacks. ML uses miscellaneous techniques. Most of the studies reviewed investigated a single ML approach. One of two approaches from at least two methods was used in other research; the techniques were compared or combined to improve anomaly detection. Half of the reviewed research used neural networks (generic NN, CNN, RBM, ANN, and NEAT). SVM was another approach used in the reviewed research. The naïve Bayes method was also presented in several research. However, a set of six features suitable for SDN was presented by authors in
[39], which were further used in four studies. In different cases, independent selection of techniques of features are conducted, which have to be included in ML. In
[40], it is demonstrated that a network attack in SDN can be predicted with an accuracy of 91.68% using Bayesian Networks machine learning approach.
3.2. Deep Learning-Based IDS in SDN
There are several studies that have used DL-based IDS in SDN. Normally, seven different kinds of threat vectors exist in the SDN, of which three threat vectors are definite and linked with the controller application plane, controller data plane, and control plane. NIDS is broadly used for detecting intruders within the network by continuously monitoring any suspicious and malicious behavior in network traffic. Based on strategies for network attack detection, NIDS are mainly of two types. The first strategy compares network traffic with pre-defined intrusion samples, and this is called signature-based detection
[41]. New kinds of attack strategies cannot be detected by this kind of NID system; despite this fact, this technique is still very popular and commonly used in commercial IDS. In the second strategy, network traffic is compared with a normal user behavior model and any deviations from normal behavior of traffic is marked as an anomaly, using ML approaches. This type of NIDS is called anomaly-based detection. This technique can even detect attacks that have never seen before. Normally, flow-based monitoring of network traffic is combined with the latter NIDS strategy, i.e., anomaly-based detection
[42].
Flow-based network traffic monitoring relies upon packet header information, which is why it handles a lower amount of data compared with payload-based NIDs,. Applications of ML approaches are found in multiple zones of computer science, including speech recognition, face detection, and intrusion detection systems; however, such ML applications have faced some issues. In
[43], the authors discuss the various issues in which ML algorithm applications affect the NID system. Although deep learning research on NID systems that leverage SDN is in its pre-stages, it is gaining more attention among researchers due to its results. Until now, DL algorithms have been extensively used in different areas of computer science for image, face, and voice recognition and has had real success. Through DL algorithms, correlations in bulk amounts of raw data can be easily found, and due to this feature, it can be broadly used in the next generation of NID systems. With the help of DL-based NID systems, one can obtain high detection accuracy and even efficiently detect attacks that have never been seen. In another study
[44] authors investigate how DL might be useful for detecting malicious java script code.
There are many advantages exhibited by the SDN-based NID system using deep learning algorithms, including quality of service, virtual management, and security enforcement. SDN eliminates dependency on hardware because the whole network can be configured through programming and SDN enables a global view of the entire network, providing the chance to strengthen the network security
[45]. Different SDN-based NID systems using a deep learning algorithm are briefly overviewed and compared. Using emulation and simulation platforms, SDN can be developed with programmable features and software switch implementations. SDN can be easily implemented in both software and hardware environments with the help of protocol standards, i.e., OpenFlow
[46]. OMNeT++
[47], NS-3, Mininet
[48], and NS-2 are some other simulation tools used for SDN. In SDN, its control plane is considered its most vital part, and is also called the operating system of the entire network. The control plane of SDN is accountable for providing a global view of the entire network and communicate with all programmable network elements. Beacon, OpenDayLight
[49], Ryu, Floodlight
[12], POX, and NOX
[13] are different kinds of controllers.
The focus of recent studies have been on employing DL algorithms in NIDS, rather than ML algorithms. Through DL algorithms, correlations in bulk amounts of raw data can be easily found, and due to this feature, it could be broadly used in the next generation of NID systems. Compared with the results of various NID systems based on ML algorithms, DL-based NID systems gave much better results in the context of SDN
[14]. Most machine learning algorithms are trained in a supervised way and these can give good results in classification tasks, but not in logic modelling; deep learning-based algorithms outperformed ML algorithms in logic modelling. As attack behaviors consistently change, they introduce new types of attacks, and unsupervised learning approaches such as RNN, stacked deep auto-encoder, and hybrid approaches are the best options in detecting these attacks in SDN-based NID systems. Currently, researchers are focusing on SDN-based NIDS using DL algorithms for SOHO networks and their satisfactory results suggest that intrusion detection system accuracy has greatly improved because of SDN scalability and DL algorithms
[50].
3.2.1. DDoS Attack Detection Using DL Algorithms
Different kinds of vulnerabilities exist in the SDN platform, due to which the architecture of SDN is being targeted by various kinds of attacks such as a DDoS attack. In a DDoS attack, the intended user cannot get access to the network resources or machine. Multiple bots or multiple people are usually responsible for a DDoS attack, e.g., to launch a DDoS attack, the intruder can take advantage of SDN characteristics against the application plane, infrastructure plane, and controller of SDN. In the SDN environment, it is very easy to deploy a DDoS attack; preventing such an attack is very difficult
[51][52]. The occurrence of DDoS attacks in SDN is increasing daily, since the advent of the internet. The major reason behind the increased occurrence of DDoS attacks is the development and emergence of botnets that are formed within a network by machines or bots when they are exploited with malware. According to
[53], the increase in DDoS attacks in 2016 was 125.36% of those that occurred in 2015. In
[54], a lightweight detection system for DDoS attacks based on SOM in SDN was suggested, with a 6-tuple feature extraction: growth of different ports and single-flows, percentage of air-flows, average of duration per flow and bytes per flow, and packets per flow. In the flow-table, statistics features are extracted at certain intervals and are used in the implementation of this proposed method, making it a light-weight system. However, as a downside, this system has some limitations; it cannot be used for traffic handling that is not based on flow rules. Enhanced power plant monitoring is now possible thanks to modern sensors. As part of the overall cogeneration procedure, cooling towers condense exhaust steam to cool the facility
[55].
In
[56], an SDN-based DDoS blocking application was designed that could block a DDoS attack. For attack detection, this scheme worked in cooperation with two targeted servers. This was a prototype study that was designed to detect HTTP flooding attacks. In
[57], the authors proposed a technique to detect a DDoS attack by using entropy calculations in the SDN controller and a deep auto-encoder approach for feature reduction. To detect attack, a threshold value was implemented, and selection of the threshold value was based on experimental results. A downside to a vast network is that there is a controller bottleneck; it can also affect the reliability affects because the threshold value can change in different scenarios.
In
[58], a system combining a fuzzy interface, a hard detection threshold under attack, and normal states based on real characteristics of traffic was proposed for the detection of DDoS attack. For attack detection, three features were chosen, including flow quality to a server, packet quantity distribution per flow, and interval time distribution. Currently, researchers are working on the detection of flow-based intrusion.
In
[28], IDS was also placed by the authors in the control plane. NSLKDD was the used dataset and a meta-heuristic Bayesian network was the technique used for the classification of traffic. Phase of selection and extraction of features was included in the proposed process to optimize the classifier. Fitness evaluation of the features which were selected was included in the classifier. After that, a Bayesian classifier was used. Seven other approaches are used were the comparison, but this proposed method was more accurate than the other algorithms, having 82.99% accuracy.
3.2.2. Anomaly Detection Using DL Algorithms
In the SDN environment, many approaches have been implemented for anomaly detection to secure the OpenFlow network. In
[59], the author designed a programmable router for a home network by using the programmability feature of the SDN network to provide an ideal location and platform for detecting malicious behavior in SOHO (small office/home office). The four most popular SDN-based anomaly detection methods include NETAD, maximum entropy detector, rate-limiting, and TRW-CB algorithms were implemented in NOX and the OpenFlow compliant switch of SDN. Experimental results of these algorithms showed that, in detecting malicious activities, these algorithms had more accurate results in SOHO networks compared with ISP. Without introducing any new performances overhead, these anomaly detection algorithm work at line rates for the SOHO network.
In
[60], an anomaly-based detection system based on flow was proposed, using a gravitational search algorithm and multi-layer perceptron. Experimental results showed that this proposed system gave a high accuracy ratio in classifying malicious and benign flows. In
[61], an SDN-based NID system was proposed by using SVM. Experimental results showed that the positive alarm rate was very high, with a lower false alarm rate. The traffic system was trained with malicious network traffic, rather than with normal data.
In
[62], the authors proposed an anomaly detection system based on DL algorithms in which flow for anomaly detection and OpenFlow was combined to reduce overhead processing. In this proposed method, the false positive rate was high in detecting attacks because network traffic flow was used for implementation.
The network traffic of social multimedia is continuously increasing due to increases in usage and continuous development of multimedia services and applications. The secure transmission of data requires a network that includes features of quality of service, quality of information, scalability, and reliability. In this context, SDN is a significant network, but energy-aware networking and runtime security affects its capability; thus, to increase SDN reliability, a SDN-based anomaly detection system was proposed in
[1]. By using DL approaches in the context of social multi-media, this system was used to detect any suspicious flow in network traffic. This proposed system consisted of two modules. The first was an anomaly-detection module based on RBM and gradient descent-based SVM for the detection of any suspicious behavior. The second module was end-to-end data delivery to satisfy SDN’s quality of service requirements. For the performance evaluation of this proposed scheme, both benchmark and real-time datasets were used. Experimental evaluation showed that this proposed scheme was very efficient and effective in effective in data delivery and anomaly detection for social multimedia.
Some studies used deep learning for general anomaly detection. In the research
[63], the authors presented DL-based IDS for SDN environments. IDS was implemented as a component of the control plane in both studies, rather than using it as an application. The location allowed for direct interaction to protect the controller. Moving target defense and IDS was the aim of the authors
[63]. To get data from training, a simulated network was generated by the researchers (of about 40,000 packets). A neuroevolutionary model was presented as a light-weight detector for the architecture, by which real-time operation was allowed. Two different detectors, DDoS and worm, were developed to achieve it, with each detector identifying one type of attack. “Neuroevolutionary of Augmenting Topologies (NEAT)” was used by researchers to combine the two detectors. NEAT is a method related to neuro-evolution with a crossover background.
The general environment of SDN was presented by authors in
[4] with unsupervised learning. The auto encoder was used in the approach; the encoder and decoder were the two phases of the auto encoder, used to identify reconstruction error and minimize it for each test sample. TensorFlow was the development library, but the used dataset was not clear.
3.2.3. Specific Circumstances of Network
Some studies also investigated specific circumstances of network. The implementation of IDS based on DL in optimal SDN was presented in
[64]. Attacks were reviewed in the control plane in
[65], and then they were classified into leakage of data, modification of data, unauthorized access, misuse of security policy, and denial of service. Anomaly detection considered features about optimal links, as the process was based on optical networks. These included usage of average bandwidth, destination nodes, formats of modulation, frequent source, and average length of route. Light-path creation, deletion, and modification were some attacks included in related networks. Point-anomaly-based methods were the first detection methods, with a point being used to represent a data instance. Probability was calculated by a user-created algorithm. A sequence-anomaly based method was the second detection method, where the occurrence of anomalies was in a sequence and a cumulative sum approach was used. NSFNET topology was used by the researchers to test an owned dataset, and the results of the detection method showed 85% accuracy.
Numerous compromised nodes were included in the attack by which synchronized traffic with low intensity was generated to disconnect hosts and links from any network. There is a growing reliance on digital measurements in the monitoring and managing of electrical power networks. Recently, wide-area monitoring systems (WAMS) have been established to enhance the situational awareness of complex networks and, by extension, their transmission efficiency
[66].
Coordinated attacks are classified using three DL techniques including convolutional neural networks, artificial neural networks, and LSTM networks. In
[65], a testbed was created by authors for the generation of their dataset in Mininet
[67] with increased traffic. After that, training and testing of the model was performed using this dataset. The results showed that when there was an increase in the speed of the vehicles, the performance was reduced, as well as efficiency. The training time of each algorithm was 100 s. The short time allowed the system to re-train as necessary.