Deep Learning for IDSs in Time Series Data



Subjects: Computer Science, Artificial Intelligence

Contributors: Konstantinos Psychogyios, Nikolaos Nikolaou

Classification-based intrusion detection systems (IDSs) use machine learning algorithms to classify incoming data into different categories based on a set of features. Although classification-based IDSs are effective in detecting known attacks, they can be less effective in identifying new and unknown attacks that correlate only weakly with the training data. Anomaly detection-based approaches, on the other hand, use statistical models and machine learning algorithms to establish a baseline of normal behavior and identify deviations from that baseline.

Keywords: intrusion detection system, deep learning, time series forecasting

1. Introduction

The ubiquity of the internet and computer networks has revolutionized the way we interact with each other, enabling information sharing and collaboration at an unprecedented scale. However, this pervasive connectivity has also created new opportunities for malicious actors to exploit vulnerabilities and gain unauthorized access to sensitive information [1]. As a result, the importance of effective intrusion detection systems (IDSs) cannot be overstated, and the need for proactive notification is emerging [2]. An IDS is a hardware device or software application that monitors network traffic, detects suspicious activity and potential security breaches, and flags malicious activity. This monitoring takes place at the packet level, so such a system can distinguish malicious from benign packets. Traditionally, this component has been implemented as a firewall and later as a rule-based expert system. With the rise of machine learning (ML) in recent years [3,4,5,6], state-of-the-art approaches apply ML techniques to IDS data logs to classify packets as suspicious or not [7,8,9,10,11].

Multivariate time series prediction [12,13,14] is an analytical approach that forecasts future values of multiple interrelated variables over time. Unlike univariate time series analysis, which focuses on a single variable, multivariate time series prediction considers the dynamic interactions and dependencies among several variables simultaneously. This method is particularly relevant in fields such as finance [15,16,17], weather forecasting [18], and industrial processes [19], where various factors influence the outcome of interest. The difficulty lies in capturing the relationships among the variables and understanding how a change in one affects the others. Machine learning techniques such as recurrent neural networks (RNNs) [20] and long short-term memory (LSTM) networks, as well as statistical models such as the autoregressive integrated moving average (ARIMA) [21], are commonly employed to handle multivariate time series data. Accurate predictions in this context can support informed decision-making, risk management, and process optimization in diverse domains.
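Before a multivariate series can be passed to a forecasting model, it is usually recast as a supervised problem in which a window of past observations predicts a future one. The following is a minimal sketch of that step, not a method taken from the cited works; the function name `make_windows`, its parameters, and the toy values are hypothetical.

```python
from typing import List, Tuple

Row = List[float]  # one time step: one value per variable


def make_windows(series: List[Row], window: int,
                 horizon: int = 1) -> Tuple[List[List[Row]], List[Row]]:
    """Split a multivariate series into (past window, future row) pairs."""
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    inputs: List[List[Row]] = []
    targets: List[Row] = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        # The model sees `window` consecutive rows ...
        inputs.append(series[start:start + window])
        # ... and must predict the row `horizon` steps after the window ends.
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


# Toy series with two variables and five time steps (values are made up).
series = [[10.0, 0.0], [12.0, 1.0], [11.0, 0.0], [15.0, 2.0], [14.0, 1.0]]
X, y = make_windows(series, window=3)
```

Each element of `X` holds all variables for every step in the window, so a model trained on these pairs can use the dependencies between variables as well as the temporal ordering.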

Machine learning IDSs can be broadly categorized into two types: (i) classification-based [22,23,24] and (ii) anomaly-based [25]. Classification-based IDSs use machine learning algorithms to classify incoming data into different categories based on a set of features. Although classification-based IDSs are effective in detecting known attacks, they can be less effective in identifying new and unknown attacks that correlate only weakly with the training data. Anomaly detection-based approaches, on the other hand, use statistical models and machine learning algorithms to establish a baseline of normal behavior and identify deviations from that baseline. Unlike classification-based IDSs, anomaly-based IDSs can detect novel attacks that have not been seen before. Despite this advantage, these models cannot easily specify the type of attack, and they perform worse than classification approaches on known attack types [26,27,28].
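The anomaly-based idea of learning a baseline and flagging deviations can be illustrated with a simple statistical rule. This is a minimal sketch under stated assumptions: the baseline is the mean and sample standard deviation of a single feature measured on benign traffic, and the threshold of three standard deviations is a common rule of thumb, not a value from the cited works. The names `fit_baseline` and `is_anomalous` are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Tuple


def fit_baseline(normal_values: List[float]) -> Tuple[float, float]:
    """Summarize benign behavior by its mean and sample standard deviation."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value: float, baseline: Tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag a value whose distance from the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # A constant baseline: anything different is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Made-up packets-per-second readings observed during normal operation.
normal_traffic = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0]
baseline = fit_baseline(normal_traffic)
flags = [is_anomalous(v, baseline) for v in [101.0, 250.0]]
```

The rule needs no attack examples for training, which is why it can react to unseen attacks, but it reports only that a value is unusual and says nothing about the attack type.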

2. Machine Learning Intrusion Detection Systems

The field of IDSs using machine learning has seen extensive research, with new methods and datasets emerging frequently [31,32]. Predicting attacks through IDS log analysis can serve as a proactive security notification feed for an organization, enhancing complementary analysis and the assessment of vulnerabilities, as pursued in [33]. Maseer, Z.K. et al. [34] evaluated many machine learning classification methods on the CIC-IDS2017 [35] dataset. For pre-processing, they applied one-hot encoding and normalization. They also employed parameter tuning and k-fold cross-validation in the training phase. The methods employed were standard classification approaches, such as random forest, support vector machines, and convolutional neural networks, applied to the binary classification task. They measured classification performance (accuracy, F1 score, precision, etc.) and training/testing times. The results showed that k-nearest neighbors (KNN), random forest, and naive Bayes achieve excellent results for these metrics.
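The pre-processing and validation steps named above (one-hot encoding, normalization, k-fold cross-validation) can be sketched in a few lines. This is an illustrative sketch, not the pipeline of [34]: min-max scaling is assumed as the normalization, the folds are built by interleaving sample indices, and all function names and sample values are hypothetical.

```python
from typing import List, Tuple


def one_hot(values: List[str]) -> List[List[int]]:
    """Encode each category as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [[1 if index[v] == i else 0 for i in range(len(categories))]
            for v in values]


def min_max(values: List[float]) -> List[float]:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) index lists; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    splits = []
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        splits.append((train, test))
    return splits


protocols = ["tcp", "udp", "tcp", "icmp"]   # categorical feature
durations = [0.0, 5.0, 10.0, 2.5]           # numeric feature
encoded = one_hot(protocols)                # columns: icmp, tcp, udp
scaled = min_max(durations)
splits = k_fold_indices(len(protocols), k=2)
```

In a real experiment the scaling range would normally be computed on the training indices of each fold only, so that no information from the test fold leaks into training.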

Imran, M. et al. [36] evaluated custom autoencoder-based models on KDD-99 [37] for the multiclass classification problem. They developed a non-symmetric deep autoencoder, which was used either as a single model (NDAE) or in a stacked manner (S-NDAE). The term non-symmetric refers to the architectures of the encoder and decoder, which in this case do not mirror each other. They evaluated the performance of these models with common metrics such as accuracy and precision. The results showed that this method achieves better metrics than other state-of-the-art approaches. Saba, T. et al. [38] developed an intrusion detection model that was tested with the BoT-IoT [39] and NID datasets (https://www.kaggle.com/datasets/sampadab17/network-intrusion-detection, accessed on 23 November 2023). The proposed model was a convolutional neural network. The BoT-IoT dataset was used for the multiclass classification task, whereas NID was used for binary classification. The results showed that the proposed model was able to classify the packets with an accuracy of 95%.

Pranto, M.B. et al. [40] tested many classification methods using the KDD-99 dataset. In the pre-processing steps, they emphasized feature selection with well-known techniques (selecting the K best features) to achieve better accuracy in the classification task. Of the tested methods, random forest performed best, reaching an accuracy of 99% for the binary classification task.
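Selecting the K best features means scoring every feature against the label and keeping the K highest-scoring ones. The sketch below illustrates the idea only; the scoring function (absolute Pearson correlation with a binary label) is an assumption chosen for simplicity and is not necessarily the one used in [40], and the function names and data are hypothetical.

```python
from math import sqrt
from typing import List


def abs_correlation(feature: List[float], labels: List[int]) -> float:
    """Absolute Pearson correlation between one feature and the labels."""
    n = len(feature)
    mean_x = sum(feature) / n
    mean_y = sum(labels) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(feature, labels))
    var_x = sum((x - mean_x) ** 2 for x in feature)
    var_y = sum((y - mean_y) ** 2 for y in labels)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return abs(cov) / sqrt(var_x * var_y)


def select_k_best(rows: List[List[float]], labels: List[int], k: int) -> List[int]:
    """Return the column indices of the k highest-scoring features."""
    n_features = len(rows[0])
    scores = [abs_correlation([row[j] for row in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Four made-up samples with three features; label 1 marks an attack.
rows = [[1.0, 5.0, 0.0], [2.0, 5.0, 1.0], [1.5, 5.0, 9.0], [2.5, 5.0, 10.0]]
labels = [0, 0, 1, 1]
best_one = select_k_best(rows, labels, k=1)
best_two = select_k_best(rows, labels, k=2)
```

Here the third feature separates the two classes almost perfectly and the second is constant, so the constant column is the first to be discarded.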

Tahri, R. et al. [41] compared many articles that proposed IDSs based on the UNSW-NB15 dataset, and more specifically on the 100,000-sample version available on Kaggle. In their survey, they found that random forest was the best-performing model in most of the studies, reaching an accuracy of up to 98%, a specificity of up to 98%, and a sensitivity of 94% for the binary classification task.
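The metrics reported in these studies all derive from the four counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the example labels are hypothetical, with 1 denoting an attack and 0 benign traffic.

```python
from typing import Dict, List


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of failing when a class is absent."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    precision = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Sensitivity and specificity are reported separately because intrusion datasets are often imbalanced: a model can reach high accuracy while still missing a large share of the rarer class.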

Regarding approaches that treat this problem as a time series task, Duque A. S. et al. [42] proposed the use of machine learning-based intrusion detection techniques for analyzing industrial time series data. The paper evaluated three different algorithms, namely matrix profiles, seasonal autoregressive integrated moving average (SARIMA), and LSTM-based [43] neural networks, using an industrial dataset based on the Modbus/TCP protocol. The paper demonstrated that the matrix profile algorithm outperformed the other models in detecting anomalies in industrial time series data while requiring minimal parameterization effort.
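A matrix profile stores, for every subsequence of a series, the distance to its most similar non-overlapping subsequence; a subsequence with no close match anywhere else stands out as a candidate anomaly. The following brute-force sketch illustrates the definition only. It is not the implementation evaluated in [42], and practical tools use much faster algorithms. The exclusion zone of half the window length is an assumption, as are the function names and the toy signal.

```python
from math import sqrt
from statistics import mean, pstdev
from typing import List


def z_normalize(window: List[float]) -> List[float]:
    """Remove offset and scale so that only the shape is compared."""
    mu, sigma = mean(window), pstdev(window)
    if sigma == 0:
        return [0.0] * len(window)
    return [(v - mu) / sigma for v in window]


def matrix_profile(series: List[float], m: int) -> List[float]:
    """Nearest-neighbor distance for every length-m subsequence (O(n^2 * m))."""
    n_sub = len(series) - m + 1
    if n_sub < 2:
        raise ValueError("series is too short for this window length")
    subs = [z_normalize(series[i:i + m]) for i in range(n_sub)]
    exclusion = max(1, m // 2)  # skip trivial matches with close neighbors
    profile = []
    for i in range(n_sub):
        best = float("inf")
        for j in range(n_sub):
            if abs(i - j) <= exclusion:
                continue
            dist = sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            best = min(best, dist)
        profile.append(best)
    return profile


# Periodic toy signal with one injected spike at index 13.
signal = [0.0, 1.0, 0.0, -1.0] * 6
signal[13] = 5.0
profile = matrix_profile(signal, m=4)
peak = max(range(len(profile)), key=profile.__getitem__)
```

The only parameter is the window length `m`, which is consistent with the low parameterization effort noted above. In this toy signal the windows that cover the spike are the only ones without an exact repeat, so the largest profile value falls on one of them.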

3. Multivariate Time Series Prediction

In the literature on multivariate time series prediction, researchers have explored various methodologies to enhance forecasting accuracy and address the complexities inherent in analyzing multiple interrelated variables over time [44,45,46]. Existing work has also investigated diverse applications of multivariate time series prediction, ranging from financial markets to healthcare, contributing valuable insights into the challenges and advancements within this interdisciplinary field.

Bloemheuvel, S. et al. [47] introduced a novel graph neural network (GNN)-based architecture, TISER-GCN, for multivariate time series regression in the context of sensor networks. It addresses a limitation of existing deep learning techniques, which focus solely on the time series data and neglect the spatial relations among geographically distributed sensors. The proposed model was evaluated using high-frequency seismic data and proved effective when compared to baseline models and traditional machine learning methods. The contributions include a flexible architecture suited to various use cases, a thorough evaluation on diverse seismic datasets, and a systematic analysis of the model’s capabilities through extensive experimentation.

Gorbett, M. et al. [48] proposed an extension of the lottery ticket hypothesis to time series Transformers, demonstrating that pruning and binarizing the weights of the model maintains accuracy similar to that of a dense Transformer. Using the Biprop algorithm, a technique proven on complex datasets, they combined weight binarization and pruning to gain computational advantages, reducing both the number of non-zero floating-point operations (FLOPs) and storage size. The approach was tested specifically on multivariate time series modeling, where it proved effective in tasks such as classification, anomaly detection, and forecasting, with potential applications in resource-constrained environments such as IoT devices, engines, and spacecraft.
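The effect of combining pruning with binarization can be seen on a single weight vector. The sketch below is not the Biprop algorithm used in [48]; simple magnitude pruning is swapped in as a stand-in, and the scaling factor `alpha` (the mean magnitude of the surviving weights) is an assumption. The function name and the weights are hypothetical.

```python
from typing import List


def prune_and_binarize(weights: List[float], keep_ratio: float) -> List[float]:
    """Zero out small weights and replace the rest with +alpha or -alpha."""
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(weights) * keep_ratio))
    # Rank positions by magnitude and keep the largest ones (magnitude pruning).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:n_keep])
    # One shared scale for all surviving weights; only their sign is stored.
    alpha = sum(abs(weights[i]) for i in kept) / n_keep
    return [(alpha if weights[i] >= 0 else -alpha) if i in kept else 0.0
            for i in range(len(weights))]


dense = [0.8, -0.1, 0.05, -0.6, 0.2, -0.9]
compressed = prune_and_binarize(dense, keep_ratio=0.5)
nonzero = sum(1 for w in compressed if w != 0.0)
```

After this step each surviving weight needs one sign bit plus a shared scale, and multiplications by pruned positions can be skipped, which is where the savings in storage and non-zero FLOPs come from.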

Wang, D. et al. [49] addressed the importance of accurate predictions in various applications of multivariate time series, such as stock prices, traffic prediction, and COVID-19 spread forecasts. They outlined the challenges that existing forecasting methods face in capturing both temporal relationships and variable dependencies, emphasizing the need for a comprehensive understanding of underlying patterns. The work proposed a spatiotemporal self-attention-based LSTNet model, integrating spatial and temporal self-attention mechanisms to capture relationships among variables and historical observations. The contributions included a model that effectively captures spatiotemporal relationships, a novel objective function to address imbalanced errors among variables, and extensive experiments demonstrating the efficiency of LSTM-based methods in multivariate time series forecasting.
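Self-attention lets every element of a sequence be re-expressed as a weighted mix of all elements, with weights derived from pairwise similarity. The sketch below shows plain scaled dot-product self-attention without the learned query, key, and value projections a trained model would have; it is not the architecture of [49]. Applying it to rows that are time steps gives a temporal view, and applying it to the transposed series gives a view across variables. All names and values are hypothetical.

```python
from math import exp, sqrt
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn similarity scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(rows: List[List[float]]) -> List[List[float]]:
    """Scaled dot-product self-attention with identity projections."""
    d = len(rows[0])
    output = []
    for query in rows:
        scores = [sum(q * k for q, k in zip(query, key)) / sqrt(d) for key in rows]
        weights = softmax(scores)
        # Each output row is a weighted average of all input rows.
        output.append([sum(w * row[j] for w, row in zip(weights, rows))
                       for j in range(d)])
    return output


# Three time steps, two variables (made-up values).
series = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
temporal = self_attention(series)                     # attention across time steps
columns = [list(column) for column in zip(*series)]   # transpose: one row per variable
spatial = self_attention(columns)                     # attention across variables
```

The same function serves both views because attention only needs a list of vectors; what changes is whether a vector describes one moment in time or one variable's history.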

This entry is adapted from the peer-reviewed paper 10.3390/fi16030073

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply.