Test-Time Augmentation for Network Anomaly Detection

Please note this is a comparison between Version 2 by Rita Xu and Version 1 by Seffi Cohen.

Machine learning-based Network Intrusion Detection Systems (NIDS) are designed to protect networks by identifying anomalous behaviors or improper uses. In recent years, advanced attacks, such as those mimicking legitimate traffic, have been developed to avoid alerting such systems. Test-Time Augmentation for Network Anomaly Detection (TTANAD) utilizes test-time augmentation to enhance anomaly detection from the data side.

NIDS
TTA
anomaly detection

1. Introduction

Network anomaly detection plays a crucial role in defending against a wide range of cyber attacks, as modern cyber threats become increasingly sophisticated and persistent in evading detection systems. Intrusion detection (ID) is a core element of network security ^[1]. The main objective of ID is to identify abnormal behaviors and attempts caused by intruders in the network and computer system ^[2]. Network Intrusion Detection Systems (NIDS) combine information from sensors that monitor different points around the organization’s network. The sensors monitor the incoming and outgoing traffic and can collect informative network features such as packet payloads, IP addresses, ports, number of bytes transmitted, and other network flow characteristics ^[3]. NIDS can be broadly categorized into two main groups: signature-based NIDS and anomaly-based NIDS. Signature-based NIDS are static in that their detection methods rely solely on a fixed set of signatures, called a knowledge database, which needs to be updated over time and requires more human effort and time ^[4]. Anomaly-based NIDS, on the other hand, are dynamic: once the normal state of the network has been learned, they can detect irregular and anomalous events ^[5]. The learning involves creating a baseline profile that represents normal network behavior, based on historical network traffic or a malicious-free network traffic snapshot. As a result, anomaly-based NIDS are considered the most popular detection method because they can detect unknown attacks (zero-day attacks) ^[6]. In real-world cyberspace tasks, storing, transferring, and processing the huge amount of data captured by the sensors is a major challenge ^[7]. Sampling techniques have been proposed in several works ^[8][9][10] to cope with this challenge. These techniques aim to take a portion of the data that preserves the characteristics of the whole dataset. Brauckhoff et al. ^[11] detailed the complete processing chain from packet capture to anomaly detection, including temporal aggregation, which extracts statistics such as the mean and standard deviation from the data that arrive during a time window of length T. Temporal aggregation is applied to achieve further data compression and to transform the traffic trace into the observation timescale of interest for anomaly detection ^[11].
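The temporal aggregation step described above can be illustrated with a minimal sketch. It assumes that flow records arrive as (timestamp, value) pairs and groups them into consecutive windows of length T. The function name `aggregate_windows`, the record layout, and the use of the population standard deviation are illustrative assumptions, not details taken from the cited work.

```python
from collections import defaultdict
from statistics import mean, pstdev

def aggregate_windows(records, T):
    """Group (timestamp, value) records into windows of length T and
    return per-window count, mean, and population standard deviation."""
    windows = defaultdict(list)
    for timestamp, value in records:
        # Window index 0 covers [0, T), index 1 covers [T, 2T), and so on.
        windows[int(timestamp // T)].append(value)
    return {
        index: {"count": len(values), "mean": mean(values), "std": pstdev(values)}
        for index, values in sorted(windows.items())
    }

# Hypothetical byte counts observed at the given times (in seconds).
records = [(0.5, 100), (1.2, 300), (2.1, 50), (3.9, 150)]
features = aggregate_windows(records, T=2)
```

Each window is reduced to a handful of statistics, which is where the data compression comes from: the detector sees one row per window instead of one row per packet or flow.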

Test-time augmentation (TTA) is an application of data augmentation techniques on the test set. TTA techniques generate multiple augmented copies of each test instance, predict each of them, and combine the results with the original instance’s prediction ^[12]. Intuitively, TTA produces different points of view at inference time, thus predicting the given test instance more robustly. Data augmentation can improve the model’s performance without changing its architecture. However, it requires more training resources since more training data are used ^[13]. TTA, on the other hand, is more efficient than data augmentation in the training phase because retraining the model is not required. Several studies, mostly from the vision domain, have used various test-time augmentation techniques in their work ^[14][15].

TTA is commonly used in image classification tasks to improve the performance of machine learning models by augmenting the test data. It has been shown to provide a significant boost in the predictive performance of various machine learning models. However, no previous works have utilized TTA for network anomaly detection, primarily because TTA has been predominantly applied to image and text data. This gap presents an opportunity to explore the potential benefits of TTA for enhancing the performance of NIDS.

2. Network Anomaly Detection

Anomaly detection can be defined as identifying patterns in the data that do not conform to expected behavior in some context ^[16]. Anomaly detection modeling can be broadly categorized into several types of techniques: statistical methods, neighbor-based methods, and dimensionality reduction-based methods ^[16]. In statistical methods, samples with low probability under the learned distribution are considered anomalies. Neighbor-based methods assume that normal data have significantly more neighbors than anomalous data. Dimensionality reduction-based methods try to find an approximation of the data using a combination of attributes that capture the bulk of the variability in the data. Additionally, anomaly detection can be accomplished using reconstruction methods that reconstruct the input from a latent space. The reconstruction error of anomalous instances will be higher because the model has been adapted to reconstruct only normal data ^[17]. Despite the progress in this field, detecting sophisticated attacks remains a significant challenge due to the evolving nature of threats and the increasing volume of network traffic. In the experiments, the researchers used an Autoencoder as a reconstruction-based anomaly detector, an Isolation Forest as a statistical-based anomaly detector, and a Local Outlier Factor as a neighbor-based anomaly detector.

2.1. Autoencoder-Based Anomaly Detection

Dau et al. ^[18] proposed a method that uses a replicator neural network, also referred to as an autoencoder, for anomaly detection. It can work in both single-class and multiple-class settings. The network is trained to reconstruct only “normal” observations, so normal samples are assumed to have low reconstruction error. Conversely, anomalous samples are expected to have higher reconstruction error because the network is not trained to replicate them. Autoencoders have been extensively studied for network intrusion detection (NID) ^[19][20][21][22][23][24]. However, a major weakness of autoencoder-based anomaly detectors is that they struggle to identify anomalies accurately in complex or noisy data. This is because autoencoders aim to reproduce the input data closely; if the input data are complicated or noisy, the autoencoder may fail to capture the underlying patterns and therefore miss anomalies. The proposed method, TTANAD, is designed to enhance the performance of various anomaly detection algorithms, including autoencoders, by providing additional perspectives on the test data through temporal augmentations.
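The reconstruction-error idea can be shown with a deliberately tiny sketch. It assumes a linear autoencoder with a single hidden unit and tied encoder/decoder weights, trained with plain batch gradient descent on two-feature toy data. The data, the starting weights, the learning rate, and the thresholding rule are all illustrative assumptions; a practical detector would use a deeper network and a neural network library.

```python
# Toy reconstruction-based detector: a linear autoencoder with one hidden
# unit and tied weights, trained by batch gradient descent on "normal" data.

def encode(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def reconstruct(w, x):
    h = encode(w, x)
    return [wi * h for wi in w]

def reconstruction_error(w, x):
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(w, x)))

def train(data, lr=0.05, epochs=300):
    w = [0.6, 0.2]  # arbitrary fixed starting weights (assumption)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in data:
            h = encode(w, x)
            r = [xi - wi * h for xi, wi in zip(x, w)]  # residual x - x_hat
            rw = sum(ri * wi for ri, wi in zip(r, w))
            for j in range(len(w)):
                # Gradient of ||x - w (w . x)||^2 with respect to w_j,
                # averaged over the training set.
                grad[j] -= 2.0 * (h * r[j] + rw * x[j]) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# "Normal" traffic: two strongly correlated features (hypothetical values).
normal = [(t / 5.0, t / 5.0 + 0.02 * (-1) ** t) for t in range(-5, 6)]
w = train(normal)

# Flag anything that reconstructs worse than the worst normal sample.
threshold = max(reconstruction_error(w, x) for x in normal)
suspicious = (1.0, -1.0)  # breaks the correlation seen during training
is_anomaly = reconstruction_error(w, suspicious) > threshold
```

Because the model only learns the direction along which normal samples vary, a point that violates that correlation cannot be rebuilt from the hidden unit and ends up with a large error.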

2.2. Local Outlier Factor Anomaly Detection

The Local Outlier Factor (LOF) was proposed by Breunig et al. ^[25] as an unsupervised anomaly detection technique that calculates an anomaly score based on the deviation of a data point’s local density from that of its neighbors. It classifies samples with significantly lower density than their neighbors as outliers. The method determines the local density of a sample using its k-nearest neighbors, and the LOF score of an observation is calculated as the ratio of its k-nearest neighbors’ average local density to its own local density. Normal samples are expected to have a local density similar to that of their neighbors, while abnormal data are expected to have a much lower local density. LOF has been widely studied for network intrusion detection ^[25][26][27][28][29][30], but its density-based mechanism can make it less effective at detecting anomalies that are not well separated from normal data points or that are located in low-density regions of the data. The proposed method addresses this weakness by providing, for each sample, several augmented instances with different feature values. An anomaly that is hard to separate in the original instance has a better chance of being separated in one of these augmented instances. In addition to autoencoders, the researchers also evaluate the effectiveness of the proposed method, TTANAD, by employing the LOF algorithm as one of the anomaly detectors in the experiments. This allows the researchers to assess the performance improvements offered by TTANAD across different anomaly detection techniques.
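The ratio of densities that defines the LOF score can be sketched in a few lines. This is a simplified illustration rather than a full implementation: it assumes exactly k neighbors per point (ties at the k-th distance are ignored), uses Euclidean distance, and does not handle duplicate points. The function names and the sample points are hypothetical.

```python
import math

def k_nearest(points, i, k):
    """Return the k nearest neighbors of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return dists[:k]

def lof_scores(points, k):
    neighbors = [k_nearest(points, i, k) for i in range(len(points))]
    # k-distance: distance from each point to its k-th nearest neighbor.
    k_dist = [n[-1][0] for n in neighbors]

    def local_reachability_density(i):
        # Reachability distance to neighbor j is max(k-distance(j), d(i, j)).
        reach = [max(k_dist[j], d) for d, j in neighbors[i]]
        return len(reach) / sum(reach)

    lrd = [local_reachability_density(i) for i in range(len(points))]
    # LOF: average neighbor density divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(len(points))
    ]

# Five points in a tight cluster and one point far away (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]
scores = lof_scores(points, k=2)
```

Scores close to 1 indicate a density similar to that of the neighbors, while the isolated point receives a score well above 1.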

2.3. Isolation Forest Anomaly Detection

The Isolation Forest method, introduced by Liu et al. ^[31], is a technique for identifying anomalies by constructing decision trees. The method works by randomly selecting a feature and then randomly selecting a split value for that feature, which partitions the data. Anomalies are instances with short average path lengths in the trees, because they are less common and require fewer splits to separate them from regular observations. Despite being widely used for network intrusion detection ^[32][33][34][35][36], Isolation Forests can be affected by outliers and by instances that differ significantly from the rest of the data, possibly leading to false positive or false negative results. The use of TTA should improve robustness by providing more points of view for each instance. The Isolation Forest algorithm is another anomaly detector that the researchers incorporate in the experiments, alongside autoencoders and LOF.
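The link between short path lengths and anomalies can be illustrated with a compact sketch of the isolation mechanism. It assumes the commonly used score 2^(-E[h(x)] / c(n)), where E[h(x)] is the mean path length over the trees and c(n) is the average path length of an unsuccessful binary-search-tree lookup. The tree representation, the parameter defaults, and the synthetic data are illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    feature = rng.randrange(len(rows[0]))  # random feature
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)  # random split value
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return ("node", feature, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    if tree[0] == "leaf":
        # Unbuilt subtrees are accounted for with the c(size) correction.
        return depth + c_factor(tree[1])
    _, feature, split, left, right = tree
    child = left if x[feature] < split else right
    return path_length(child, x, depth + 1)

def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(data, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size

def anomaly_score(forest, x):
    trees, size = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(size))

# Synthetic data: 60 points around the origin plus one distant point.
data_rng = random.Random(1)
inliers = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
outlier = (8.0, 8.0)
forest = fit_forest(inliers + [outlier])
outlier_score = anomaly_score(forest, outlier)
```

The distant point is typically cut off after only a few random splits, so its mean path length is short and its score is closer to 1 than the scores of the clustered points.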

3. Test-Time Augmentation

Test-time augmentation is the process of producing several augmented copies of each sample in the test set, making a prediction for each copy, and then returning an ensemble of those predictions. TTA has been shown to improve results in many domains, most notably the vision domain. In AlexNet ^[15], the authors applied TTA by averaging the predictions over ten crops of the test image (the four corner patches, the center patch, and their horizontal reflections). Cohen et al. ^[37] proposed a TTA-based technique for tabular anomaly detection that aims to improve anomaly detection performance on tabular data. Shanmugam et al. ^[12] determine the augmentations used in TTA by setting an appropriate weight for each augmentation created. Their method significantly outperforms existing approaches by focusing on the factors influencing TTA augmentation and finding the optimal weight per augmentation. A study by Cohen et al. ^[38] presented state-of-the-art results using TTA to predict Intensive Care Unit (ICU) survival. Although TTA has been successfully applied to images, text, and tabular data, its application to network anomaly detection has not been extensively explored.
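The general TTA recipe (augment, score each copy, combine) can be sketched independently of any particular detector. The example below assumes an unweighted mean as the ensemble rule and uses a toy scoring function with two scaling augmentations. None of these choices are taken from TTANAD or the cited works; they only illustrate the control flow.

```python
from statistics import mean

def tta_score(score_fn, x, augmentations):
    """Score the original instance and each augmented copy, then average."""
    views = [x] + [augment(x) for augment in augmentations]
    return mean(score_fn(view) for view in views)

# Toy anomaly score: absolute difference between two features (assumption).
def toy_score(x):
    return abs(x[0] - x[1])

# Hypothetical augmentations that rescale every feature slightly.
def scale(factor):
    return lambda x: tuple(factor * value for value in x)

augmentations = [scale(0.9), scale(1.1)]
instance = (10.0, 4.0)
combined = tta_score(toy_score, instance, augmentations)
```

A weighted average, as in the approach of Shanmugam et al., would replace `mean` with a weighted sum over the views. No retraining is needed in either case, because only the inference step changes.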

References

Li, Y.; Ma, R.; Jiao, R. A hybrid malicious code detection method based on deep learning. Int. J. Secur. Appl. 2015, 9, 205–216.
Kwon, D.; Kim, H.; Kim, J.; Suh, S.C.; Kim, I.; Kim, K.J. A survey of deep learning-based network anomaly detection. Clust. Comput. 2019, 22, 949–961.
Fernandes, G.; Rodrigues, J.J.; Carvalho, L.F.; Al-Muhtadi, J.F.; Proença, M.L. A comprehensive survey on network anomaly detection. Telecommun. Syst. 2019, 70, 447–489.
Garcia-Teodoro, P.; Diaz-Verdejo, J.; Maciá-Fernández, G.; Vázquez, E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Comput. Secur. 2009, 28, 18–28.
Zhang, J.; Zulkernine, M. Anomaly based network intrusion detection with unsupervised outlier detection. In Proceedings of the 2006 IEEE International Conference on Communications, Istanbul, Turkey, 11–15 June 2006; Volume 5, pp. 2388–2393.
Xin, Y.; Kong, L.; Liu, Z.; Chen, Y.; Li, Y.; Zhu, H.; Gao, M.; Hou, H.; Wang, C. Machine learning and deep learning methods for cybersecurity. IEEE Access 2018, 6, 35365–35381.
Su, L.; Yao, Y.; Li, N.; Liu, J.; Lu, Z.; Liu, B. Hierarchical Clustering Based Network Traffic Data Reduction for Improving Suspicious Flow Detection. In Proceedings of the 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), New York, NY, USA, 1–3 August 2018; pp. 744–753.
Jiang, K.; Wang, W.; Wang, A.; Wu, H. Network intrusion detection combined hybrid sampling with deep hierarchical network. IEEE Access 2020, 8, 32464–32476.
Wang, Q.; Ouyang, X.; Zhan, J. A classification algorithm based on data clustering and data reduction for intrusion detection system over big data. KSII Trans. Internet Inf. Syst. (TIIS) 2019, 13, 3714–3732.
Liu, L.; Wang, P.; Lin, J.; Liu, L. Intrusion detection of imbalanced network traffic based on machine learning and deep learning. IEEE Access 2020, 9, 7550–7563.
Brauckhoff, D.; Salamatian, K.; May, M. A signal processing view on packet sampling and anomaly detection. In Proceedings of the 2010 IEEE INFOCOM, San Diego, CA, USA, 14–19 March 2010; pp. 1–9.
Shanmugam, D.; Blalock, D.; Balakrishnan, G.; Guttag, J. When and Why Test-Time Augmentation Works. arXiv 2020, arXiv:2011.11156.
Mikołajczyk, A.; Grochowski, M. Data augmentation for improving deep learning in image classification problem. In Proceedings of the 2018 International Interdisciplinary Ph.D. Workshop (IIPhDW), Swinoujscie, Poland, 9–12 May 2018; pp. 117–122.
Wang, G.; Li, W.; Aertsen, M.; Deprest, J.; Ourselin, S.; Vercauteren, T. Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks. Neurocomputing 2019, 338, 34–45.
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. (CSUR) 2009, 41, 1–58.
Chalapathy, R.; Chawla, S. Deep learning for anomaly detection: A survey. arXiv 2019, arXiv:1901.03407.
Dau, H.A.; Ciesielski, V.; Song, A. Anomaly detection using replicator neural networks trained on examples of one class. In Proceedings of the Asia-Pacific Conference on Simulated Evolution and Learning, Dunedin, New Zealand, 15–18 December 2014; pp. 311–322.
Farahnakian, F.; Heikkonen, J. A deep auto-encoder based approach for intrusion detection system. In Proceedings of the 20th International Conference on Advanced Communication Technology (ICACT), Online, Republic of Korea, 11–14 February 2018; pp. 178–183.
Azmin, S.; Islam, A.M.A.A. Network intrusion detection system based on conditional variational laplace autoencoder. In Proceedings of the 7th International Conference on Networking, Systems and Security, Dhaka, Bangladesh, 22–24 December 2020; pp. 82–88.
Yang, L.; Song, Y.; Gao, S.; Hu, A.; Xiao, B. Griffin: Real-time network intrusion detection system via ensemble of autoencoder in SDN. IEEE Trans. Netw. Serv. Manag. 2022, 19, 2269–2281.
Li, X.; Chen, W.; Zhang, Q.; Wu, L. Building auto-encoder intrusion detection system based on random forest feature selection. Comput. Secur. 2020, 95, 101851.
Rao, K.N.; Rao, K.V.; PVGD, P.R. A hybrid intrusion detection system based on sparse autoencoder and deep neural network. Comput. Commun. 2021, 180, 77–88.
Muhammad, G.; Hossain, M.S.; Garg, S. Stacked autoencoder-based intrusion detection system to combat financial fraudulent. IEEE Internet Things J. 2020, 10, 2071–2078.
Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; pp. 93–104.
Gulhare, A.K.; Badholia, A.; Sharma, A. Mean-Shift and Local Outlier Factor-Based Ensemble Machine Learning Approach for Anomaly Detection in IoT Devices. In Proceedings of the International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 20–22 July 2022; pp. 649–656.
Omar, M. Malware Anomaly Detection Using Local Outlier Factor Technique. In Machine Learning for Cybersecurity: Innovative Deep Learning Solutions; Springer: Berlin/Heidelberg, Germany, 2022; pp. 37–48.
Tang, J.; Ngan, H.Y. Traffic outlier detection by density-based bounded local outlier factors. Inf. Technol. Ind. 2016, 4.
Auskalnis, J.; Paulauskas, N.; Baskys, A. Application of local outlier factor algorithm to detect anomalies in computer network. Elektron. Elektrotechnika 2018, 24, 96–99.
Madhupriya, G.; Shalinie, S.M.; Rajeshwari, A.R. Detecting DDoS attack in cloud computing using local outlier factors. In Proceedings of the 2nd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 11–12 May 2018; pp. 859–863.
Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422.
Shukla, A.K.; Srivastav, S.; Kumar, S.; Muhuri, P.K. UInDeSI4.0: An efficient Unsupervised Intrusion Detection System for network traffic flow in Industry 4.0 ecosystem. Eng. Appl. Artif. Intell. 2023, 120, 105848.
AbuAlghanam, O.; Alazzam, H.; Alhenawi, E.; Qatawneh, M.; Adwan, O. Fusion-based anomaly detection system using modified isolation forest for internet of things. J. Ambient. Intell. Humaniz. Comput. 2022, 14, 1–15.
Chiba, Z.; Abghour, N.; Moussaid, K.; Omri, A.E.; Rida, M. Newest collaborative and hybrid network intrusion detection framework based on suricata and isolation forest algorithm. In Proceedings of the 4th International Conference on Smart City Applications, Casablanca, Morocco, 2–4 October 2019; pp. 1–11.
Laskar, M.T.R.; Huang, J.X.; Smetana, V.; Stewart, C.; Pouw, K.; An, A.; Chan, S.; Liu, L. Extending isolation forest for anomaly detection in big data via K-means. ACM Trans.-Cyber-Phys. Syst. (TCPS) 2021, 5, 1–26.
Ripan, R.C.; Sarker, I.H.; Anwar, M.M.; Furhad, M.H.; Rahat, F.; Hoque, M.M.; Sarfraz, M. An isolation forest learning based outlier detection approach for effectively classifying cyber anomalies. In Proceedings of the Hybrid Intelligent Systems: 20th International Conference on Hybrid Intelligent Systems (HIS 2020), Virtual, 14–16 December 2020; pp. 270–279.
Cohen, S.; Goldshlager, N.; Rokach, L.; Shapira, B. Boosting Anomaly Detection Using Unsupervised Diverse Test-Time Augmentation. Inf. Sci. 2023, 626, 821–836.
Cohen, S.; Dagan, N.; Cohen-Inger, N.; Ofer, D.; Rokach, L. ICU survival prediction incorporating test-time augmentation to improve the accuracy of ensemble-based models. IEEE Access 2021, 9, 91584–91592.