Applying machine learning (ML) based methodologies to Network Intrusion Detection Systems (NIDSs) raises valid concerns, primarily because current ML models remain inherently vulnerable to various security threats. Our focus centers on two principal threats to ML-based NIDSs: adversarial attacks and distribution shifts. We advocate establishing and maintaining the robustness of an ML-based NIDS throughout its entire lifecycle, enabling it to effectively address unforeseen conditions.
1. Introduction
Computer networks have revolutionized the way humans live, work, and communicate, and their continued success and advancement will undoubtedly shape the future of our interconnected world. With the development of computer networks, the attack surface has grown as well. To protect networks from various security threats, many defense mechanisms against network attacks have been proposed, such as network intrusion detection systems (NIDSs). In recent decades, machine learning (ML) methods have been explored as a solution to intrusion detection problems. ML has been widely applied across a broad range of industries and domains. For instance, ML applications in domains such as computer vision (CV) [1] and natural language processing (NLP) [2] have achieved significant real-world success. At the same time, many network security tasks have also benefited from leveraging ML techniques. Recent NIDS advances [3][4] take advantage of deep learning (DL) to drive the detection and classification of malicious network traffic. ML-based NIDSs can automatically extract high-level features by learning from training datasets, achieving excellent detection performance while being more convenient than traditional signature-based NIDSs.
Despite the impressive performance of machine learning systems, their robustness remains elusive and constitutes a critical issue that impedes large-scale adoption [5]. For security tasks in particular, such as NIDSs, robustness is the main concern for trustworthy real-world ML applications [6]. The considerable demand for robustness partially constrains the real-world deployment of ML-based NIDSs [7]. On the one hand, research on the reliability and trustworthiness of ML-based NIDSs is still at an early stage [8][9]. On the other hand, numerous studies [7][10] highlight the concern that the vulnerability of applied ML will become part of an expanding attack surface. Furthermore, practical applications are crucial for validating theoretical advancements and gaining real-world insights [11]. To accelerate ML-based NIDS research toward practical applications, as in CV and NLP, addressing the robustness of ML-based NIDSs should be a top priority. In acknowledgment of this requirement, a growing body of literature centers on the development and evaluation of robust ML systems [5], not only for NIDSs but also for other fields. However, these efforts are dispersed across various stages of the ML workflow and focus on different viewpoints [12]. Given that robustness in ML often carries multiple meanings depending on the context and use case [13], a systematic survey of state-of-the-art robustness studies for ML-based NIDSs is important.
2. Vulnerabilities of Machine Learning for Network Security
2.1. The Concepts Related to ML Robustness
Robustness is a term that has come to cover a spectrum of interpretations and is even overloaded [14]. For instance, robustness can refer to a wide range of aspects, including but not limited to raw task performance on test sets, the ability to sustain task performance on manipulated or modified inputs, generalization within and across domains, and resilience against adversarial attacks.
Figure 1. The concepts related to ML robustness.
Trustworthy ML refers to ML models that are designed, deployed, and utilized in a manner that prioritizes ethical considerations, transparency (interpretation), accountability, fairness, and reliability (robustness), as shown in Figure 1. The robustness of ML corresponds to the reliability subfield of trustworthy ML. In the context of machine learning, generalization refers to a trained model's capacity to make accurate predictions on new, unseen data that were not part of its training set. Depending on the domain/distribution that the unseen data belong to, two cases of generalization are presented in the literature [15]. The first case, denoted in-domain (ID) generalization, covers unseen data sampled from the same domain/distribution as the training dataset. The second case, the model's capacity to correctly infer unseen data sampled from a different domain/distribution, is denoted out-of-domain (OOD) generalization. OOD generalization is essentially the same notion as robustness against distribution shifts. Distribution shifts refer to the phenomenon where the input data fed to an ML model differ from the source distribution of its training data. Adversarial attacks are a vulnerability of machine learning in which deliberately crafted, small, imperceptible perturbations are added to input data, causing a trained model to misclassify or produce unintended outputs.
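As a concrete illustration of the latter, the sketch below crafts such a perturbation with the fast gradient sign method (FGSM), assuming a differentiable PyTorch classifier; the `model`, `x`, `y`, and `epsilon` names are illustrative placeholders, and a realistic attack on an NIDS would additionally need to respect feature validity constraints.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.05):
    """Craft an FGSM adversarial example: take one bounded step in the
    direction of the input gradient that increases the classification loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Larger epsilon yields a stronger but more easily detected perturbation.
    return (x_adv + epsilon * x_adv.grad.sign()).detach()
```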
2.2. The Threats against ML-Based NIDSs
Adversarial attacks aim to fool the ML model by perturbing the data [16]. Based on the stage at which the perturbed data are used, adversarial attacks can be classified into the following types.
- Poisoning Attacks: In the training stage of the ML workflow, poisoning attacks perturb the training dataset, by changing the inputs or flipping the labels, so as to influence the trained model's future capability (a minimal label-flipping sketch follows this list). If the attacker embeds a trigger in the training data so that the ML model can be forced to execute particular behaviors in the inference stage, those attacks are known as backdoor attacks.
- Evasion Attacks: In the inference stage, evasion attacks attempt to manipulate or exploit a machine learning model by perturbing input data in such a way that the model's predictions are confused or misled.
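As referenced above, the following minimal sketch illustrates the label-flipping flavor of a poisoning attack on a binary (benign/malicious) training set; the function name and parameters are illustrative rather than drawn from a specific attack in the literature.

```python
import numpy as np

def flip_labels(y_train, flip_fraction=0.1, seed=0):
    """Label-flipping poisoning sketch: invert the labels of a random subset
    of a binary (0 = benign, 1 = malicious) training set before training."""
    rng = np.random.default_rng(seed)
    y_poisoned = y_train.copy()
    n_flip = int(flip_fraction * len(y_train))
    idx = rng.choice(len(y_train), size=n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]
    return y_poisoned
```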
Based on the attacker's knowledge of the target ML model, adversarial attacks can be divided into three cases:
- White-box Attacks: The attackers know everything about the target ML model, such as its decision boundary. In this case, attackers can modify the inputs with minimal perturbation yet a very high success rate [17].
- Gray-box Attacks: The attackers have only partial knowledge of the target ML model but can access the target model and observe its behavior [18].
- Black-box Attacks: The attackers have no information about the target ML model and cannot access the target model's responses.
Regarding ML-based NIDSs, adversarial attacks can be categorized into two types based on the level at which the input perturbation is applied:
- Feature-based Attacks: This type of adversarial attack against ML-based NIDSs focuses on perturbing the extracted features that represent a network traffic flow (a feature-space sketch follows this list).
- Traffic-based Attacks: Because the feature extraction component is part of the NIDS, it is impractical to directly modify the extracted features in real-world scenarios. Traffic-based attacks refer to attack methods that instead modify the original network traffic [19].
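To make the feature-based case concrete, the toy sketch below applies a small, bounded random perturbation to a flow-feature vector while leaving selected fields untouched; a real attack would typically be gradient- or query-guided, and the `immutable_idx` feature indices are assumptions about which fields must stay fixed.

```python
import numpy as np

def perturb_flow_features(x, budget=0.05, immutable_idx=(), seed=0):
    """Feature-space attack sketch: add a relative perturbation of at most
    `budget` to each mutable feature of a flow vector, keeping values
    non-negative so the flow remains superficially plausible."""
    rng = np.random.default_rng(seed)
    delta = rng.uniform(-budget, budget, size=x.shape) * np.abs(x)
    delta[list(immutable_idx)] = 0.0  # e.g., protocol or flag fields
    return np.clip(x + delta, 0.0, None)
```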
Distribution shifts can cause ML models to fail, for example by becoming less accurate. Since the data differ from the source distribution, another term commonly used for robustness against distribution shifts is out-of-distribution (OOD) generalization. For different data types, distribution shifts are normally classified into subtypes [20] according to their causes.
Many factors can cause distribution shifts in network traffic data, such as changing network environments, user behavior that changes over time, and new, advanced protocol versions. Additionally, given that current ML-based NIDS methods work on varying types of data, including tabular data [21], images [22], and sequences [23], the distribution shifts in network data have a complex composition. Although these varying types of distribution shifts challenge the robustness of ML-based NIDSs, studies of distribution shifts in ML-based NIDSs or network traffic analysis have not received enough attention. Existing works [24] focus on only one type of shift cause, such as temporal drift.
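A simple way to surface such shifts in practice is to compare per-feature distributions between the training data and newly observed traffic, for example with a two-sample Kolmogorov-Smirnov test; the sketch below is a minimal illustration, and the feature matrices and p-value threshold are assumptions.

```python
import numpy as np
from scipy.stats import ks_2samp

def flag_drifted_features(X_train, X_new, p_threshold=0.01):
    """Per-feature two-sample KS test between training-time flow features
    and newly captured ones; low p-values suggest a distribution shift."""
    drifted = []
    for j in range(X_train.shape[1]):
        res = ks_2samp(X_train[:, j], X_new[:, j])
        if res.pvalue < p_threshold:
            drifted.append((j, res.statistic))
    return drifted  # list of (feature index, KS statistic)
```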
3. Fostering Robustness in the Whole Design Lifecycle
Improving robustness necessitates coordinated efforts across multiple stages in the ML application lifecycle, encompassing data sanitization, robust model development, anomaly monitoring, and risk auditing. Conversely, the breakdown of trust in any individual link or aspect can significantly compromise the overall trustworthiness of the entire system. Thus, a holistic approach to maintaining trust throughout all stages of the AI system's lifecycle is essential to ensure its reliability and integrity [25].
Considering that ML model robustness is not a one-time achievement but an ongoing process that requires vigilance, updates, and evaluation, we investigate the robustness of ML-based NIDS models by following the sequential stages of the ML workflow. The ML workflow has six main stages: (1) Data collection and processing; (2) Model structure design; (3) Training and optimization; (4) Fine-tuning (an optional stage); (5) Evaluation; (6) Application inference. From the viewpoint of model robustness, we treat obtaining the model weights as a split point, because once training is finished, the robustness of the model is largely fixed. Hence, we group the first three stages together, since robustness is built into the learning model during those stages, and we group the remaining three stages together, since model robustness can still be patched up in them.
3.1. Robustness Building-in Techniques
Considering that ML-based NIDSs heavily rely on data, improving data quality also improves the robustness of ML models. A large body of research in CV [26][27][28] and NLP [29][30] reports that data augmentation can improve out-of-distribution robustness. However, because network traffic differs greatly from images or text, those methods may not be directly applicable to ML-based NIDSs or other network security tasks. In the training stage, the optimization method also affects the robustness of the trained ML model. To glean information from the data itself, contrastive learning (CL) creates pairs of positive and negative samples. The fundamental idea behind contrastive learning is to build a contrastive loss function so that the model can compare similar and dissimilar data, pulling comparable samples together and pushing distinct samples apart. The performance of contrastive learning models depends critically on the design of positive and negative sampling strategies, and the robustness of the model is strongly influenced by the difficulty of the constructed sample pairs. Some adversarial learning methods combine data augmentation and contrastive learning: self-supervised adversarial learning, for example, unlike traditional CL, uses adversarial augmentation to make hard-sample mining easier.
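A minimal sketch of such a contrastive objective is the NT-Xent loss popularized by SimCLR, shown below; `z1` and `z2` are assumed to be embeddings of two augmented views of the same batch of traffic samples, and the temperature value is illustrative.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss sketch: each row of z1 is pulled toward the
    matching row of z2 (its augmented view) and pushed away from every
    other embedding in the batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # 2N x d
    sim = z @ z.t() / temperature                  # pairwise cosine similarities
    sim.fill_diagonal_(float('-inf'))              # exclude self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)           # positive pair = matching view
```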
3.2. Robustness Patching-up Techniques
For the fine-tuning, evaluation, and application inference stages, manipulating data or measuring distances can still improve or evaluate the robustness of ML models. Fine-tuning models with adversarial samples can improve robustness against adversarial attacks. Meanwhile, fine-tuning with data from other domains can help a trained model adapt to a new domain, which in some cases can be used to address the concept drift issue. The evaluation stage is important for verifying a trained ML model's degree of robustness. Robustness certification studies aim to certify the robustness of an ML-based classifier against adversarial attacks/perturbations. One line of work uses the randomized smoothing technique to solve this problem [31]. Randomized smoothing-based certification methods use a large number of noised data samples to estimate the model's robustness against adversarial attacks. Recently, Wang et al. [32] proposed a robustness certification framework named BARS for DL-based traffic analysis, which can be considered an upstream task of NIDSs. To evaluate robustness against distribution shifts, cross-dataset evaluation methods [33][34] have been proposed, demonstrating the importance of improving OOD generalization.
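The core idea of randomized smoothing can be sketched as follows; `model_predict` is a hypothetical function returning integer class labels for a batch of (noisy) feature vectors, and the confidence bound is a crude normal approximation rather than the exact Clopper-Pearson bound used in full certification procedures.

```python
import numpy as np
from scipy.stats import norm

def smoothed_certify(model_predict, x, sigma=0.25, n_samples=1000, alpha=0.001, seed=0):
    """Randomized-smoothing sketch: estimate the class most often predicted
    under Gaussian input noise and a certified L2 radius sigma * Phi^-1(p_lower)
    around x, where p_lower lower-bounds the top-class probability."""
    rng = np.random.default_rng(seed)
    noisy = x[None, :] + sigma * rng.standard_normal((n_samples, x.shape[0]))
    preds = model_predict(noisy)                 # hypothetical: integer class labels
    top_class = np.bincount(preds).argmax()
    # Crude lower confidence bound on the top-class probability.
    p_lower = (preds == top_class).mean() - norm.ppf(1 - alpha) * np.sqrt(0.25 / n_samples)
    if p_lower <= 0.5:
        return None, 0.0                         # abstain: no certificate
    return top_class, sigma * norm.ppf(p_lower)  # certified L2 radius
```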