Optimizing Cybersecurity Attack Detection in Computer Networks: Comparison
Please note this is a comparison between Version 1 by Hadi Najafi Mohsenabad and Version 2 by Rita Xu.

The escalating use of computer networks and the corresponding increase in cyberattacks have propelled Intrusion Detection Systems (IDSs) to the forefront of computer science research. An IDS is a crucial security technology that monitors network traffic and host activities to identify unauthorized or malicious behavior.

  • network security
  • feature selection
  • Intrusion Detection Systems
  • deep learning
  • cyberattack

1. Introduction

Large-scale local area networks use anomaly detection as a primary technique for solving security issues. However, it can be challenging to extract accurate traffic data for anomaly identification [1]. Intrusion detection in computer networks is crucial as it affects communication and security; yet, finding network intrusions remains a challenge [1]. Network intrusion detection is complicated due to the need for significant data for training advanced machine learning models [1].
The emergence of new threats that older systems cannot detect presents significant difficulties in network intrusion detection [2]. The increasing complexity and volume of real-time traffic, driven by the rapid development and adoption of technologies such as 5G, IoT, and Cloud Computing, have led to more sophisticated and varied cyberattacks, posing significant challenges to cyberspace security [3]. Network Intrusion Detection Systems (NIDSs), which serve as the firewall’s backup line of defense, function to develop strategies, provide real-time monitoring, swiftly identify malicious network attacks, and implement dynamic security measures [3].
Cybersecurity research has become essential given how much networks are used in the modern world [4]. Current Intrusion Detection Systems (IDSs) have failed to identify new attacks, improve detection accuracy, and lower false alarm rates despite years of research. To solve these issues, many scholars have concentrated on developing IDSs with machine learning methods [4]. Machine learning offers accurate and automatic identification of differences between normal and abnormal data, including recognizing unidentified attacks, while providing excellent generalizability [4].
Feature Selection (FS) techniques play a crucial role in data preprocessing by enabling effective data reduction and supporting the identification of precise data models [5]. Although an exhaustive search for the ideal feature subset is often impractical, the literature offers numerous search algorithms for overcoming this difficulty [5]. Four widely used feature selection techniques for effective network intrusion detection are the filter, wrapper, embedded, and hybrid methods [5].
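To make the filter approach mentioned above concrete, the following is a minimal sketch that scores each feature independently of any classifier and keeps the highest-scoring ones. It is an illustration only, not the method of any cited work: the scoring function (absolute Pearson correlation with the label), the function names, and the toy flow records are all assumptions chosen for brevity.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information about the label
    return cov / math.sqrt(var_x * var_y)


def filter_select(rows, labels, feature_names, k):
    """Rank features by |correlation with the label| and keep the top k."""
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in rows]
        scores[name] = abs(pearson(column, labels))
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy flow records: duration, bytes sent, and a constant flag.
feature_names = ["duration", "bytes_sent", "flag"]
rows = [
    [1.0, 100.0, 1.0],
    [2.0, 110.0, 1.0],
    [1.5, 900.0, 1.0],
    [2.5, 950.0, 1.0],
]
labels = [0, 0, 1, 1]  # 0 = normal, 1 = attack

selected, scores = filter_select(rows, labels, feature_names, k=1)
print(selected, scores)
```

A wrapper method would instead retrain a classifier on each candidate subset and keep the subset with the best validation score, which is usually more accurate but far more expensive than the per-feature scoring sketched here.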
Results on the KDD Cup'99 challenge dataset demonstrated the superiority of Long Short-Term Memory (LSTM) classifiers over static classifiers, underscoring the significance of LSTM networks for intrusion detection [6]. LSTM networks can learn from past data and establish relationships between connection records, which makes them well suited to intrusion detection [6]. Hyperparameters govern the learning process and model quality in almost all machine learning algorithms. Adjusting them can produce an effectively unlimited number of models with varying performance levels and learning rates, depending on the machine learning technique used [7].
Exponential data growth in recent times has posed significant challenges in the classification process [8]. Feature selection addresses this problem by improving classification accuracy while reducing data complexity [8], which matters because the detection process involves many features that are time-consuming to process.
Feature selection is essential for increasing the efficacy of Network Intrusion Detection Systems (NIDSs) [9]. The chosen feature selection approach influences both the time needed to observe traffic behavior and the degree of accuracy improvement. The four approaches are the filter, wrapper, embedded, and hybrid methods [9].
In this work, the data-cleaning phase removed 13 of the initial 80 features, leaving a pool of 67. The feature selection process then chose subsets of 5, 9, and 10 features from that pool. Figure 1 depicts the Intrusion Detection System approach as a flowchart.
Figure 1. The Intrusion Detection System.
Figure 2 depicts the network-based Intrusion Detection System.
Figure 2. Network-Based Intrusion Detection System.

2. Optimizing Cybersecurity Attack Detection in Computer Networks

In the comparison, the Generative Adversarial Network (GAN) algorithm outperformed the Artificial Neural Network (ANN), Multilayer Perceptron (MLP), K-Nearest Neighbor (KNN), Decision Tree (DT), Convolutional Neural Networks (CNNs), and Auto Encoder techniques. The Generative Adversarial Network (GAN) method previously achieved a 99.34 percent classification accuracy [10]. The research described in [11] used machine learning and data mining techniques to combat network intrusion. The focus was on improving the intrusion detection rate while reducing false alarms. The paper [11] explores rule learning using RIPPER on imbalanced intrusion datasets and proposes a combination of oversampling, under-sampling, and clustering-based approaches to address the data imbalance. The evaluation of real-world intrusion datasets showed that oversampling through synthetic generation outperforms replication, and the clustering-based approach enhanced intrusion detection beyond synthetic generation alone. As the aforementioned article describes, an Intrusion Detection System (IDS) is a network security technology that monitors hostile activity on computers or networks. However, due to the dynamic and complicated nature of cyberattacks, typical IDS techniques require assistance. As a result, Ref. [11] proposes that effective adaptive methods, including machine learning approaches, can result in reduced false alarms, increased detection rates, and affordable computational costs.
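The resampling strategies discussed above can be sketched in a few lines. The code below contrasts random undersampling of the majority class, oversampling by replication, and oversampling by synthetic generation. It is a simplified illustration and not the procedure of [11]: the synthetic step interpolates between random pairs of minority records rather than between nearest neighbors, and the function names and toy records are assumptions.

```python
import random


def undersample(majority, target_size, rng):
    """Randomly keep target_size majority-class records (no replacement)."""
    return rng.sample(majority, target_size)


def oversample_replicate(minority, target_size, rng):
    """Grow the minority class by duplicating randomly chosen records."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra


def oversample_synthetic(minority, target_size, rng):
    """Grow the minority class by interpolating between random record pairs."""
    synthetic = []
    while len(minority) + len(synthetic) < target_size:
        a, b = rng.sample(minority, 2)
        t = rng.random()  # position along the segment from a to b
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return minority + synthetic


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical imbalanced data: 20 normal records, 3 attack records.
normal = [[float(i), float(i % 3)] for i in range(20)]
attack = [[100.0, 5.0], [110.0, 6.0], [105.0, 7.0]]

normal_small = undersample(normal, 10, rng)
attack_copies = oversample_replicate(attack, 10, rng)
attack_synth = oversample_synthetic(attack, 10, rng)
print(len(normal_small), len(attack_copies), len(attack_synth))
```

Replication only repeats existing attack records, whereas the synthetic variant produces new points that lie between them, which is one intuition for why synthetic generation can generalize better.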
The performances of several methods based on conventional Artificial Intelligence (AI) and Computational Intelligence (CI) have previously been compared and reviewed. This has highlighted how the characteristics of CI techniques can be utilized to build effective IDSs [12].
Pervez, M.S. and Farid, D.M. [13] reviewed the classification accuracy of various classifiers using features of different dimensions. Liu, H. and Hussain, F. [14] proposed a novel approach that combines classification and techniques for the multiclass NSL-KDD Cup99 dataset. The proposed method utilizes Support Vector Machines (SVMs). Shapoorifard, H. and Shamsinejad, P. [14] researched novel technologies to enhance the classification accuracy of Center and Nearest Neighbors (CANN) Intrusion Detection Systems. They evaluated the effectiveness of these technologies using the NSL-KDD Cup99 dataset.
In ref. [15], the authors discuss the risk of hostile attacks on networks that results from widespread internet use. They emphasize how crucial effective Intrusion Detection Systems are for classifying and predicting cyberattacks. According to [15], a hybrid model that combines a firefly-based machine learning technique with Principal Component Analysis (PCA) can be used to classify Intrusion Detection System (IDS) datasets. The model transforms the dataset with one-hot encoding, applies the hybrid PCA–firefly algorithm for dimensionality reduction, and uses XGBoost for classification. Based on the experimental data, the proposed model outperformed state-of-the-art machine learning methods.
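As an illustration of the one-hot encoding step mentioned above, the sketch below replaces a categorical column (here a hypothetical protocol field) with one 0/1 column per category. The function names and toy records are assumptions, and the PCA–firefly reduction and XGBoost classification stages of [15] are not shown.

```python
def fit_one_hot(values):
    """Return the sorted list of distinct categories seen in the column."""
    return sorted(set(values))


def one_hot(value, categories):
    """Encode one value as a 0/1 vector; unseen categories map to all zeros."""
    return [1 if value == c else 0 for c in categories]


def encode_rows(rows, cat_index):
    """Replace the categorical column at cat_index with its one-hot columns."""
    categories = fit_one_hot(row[cat_index] for row in rows)
    encoded = []
    for row in rows:
        numeric = [v for i, v in enumerate(row) if i != cat_index]
        encoded.append(numeric + one_hot(row[cat_index], categories))
    return encoded, categories


# Hypothetical flow records: duration, protocol, bytes.
rows = [
    [0.5, "tcp", 200],
    [1.2, "udp", 80],
    [0.1, "icmp", 64],
    [0.7, "tcp", 512],
]

encoded, categories = encode_rows(rows, cat_index=1)
print(categories)
for row in encoded:
    print(row)
```

Because one-hot encoding widens the feature matrix, it is commonly followed by a dimensionality reduction step, which is the role the PCA-based stage plays in the pipeline described above.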
In ref. [16], the authors selected 55 of an initial 80 features and achieved a final accuracy of 95% using Deep Neural Network (DNN) and Binary Particle Swarm Optimization (BPSO) approaches for feature selection and classification.
Lava, A. and Savant, P. [17] describe a machine learning-based NIDS model for binary classification of the CSE-CIC-IDS2018 on AWS dataset. Logistic regression, random forest, and gradient boosting were all implemented as misuse detection techniques, and the researchers found that gradient boosting outperformed the others. The model's performance on the test dataset demonstrated its generalizability. The gradient boosting estimator achieved a recall and precision of 0.98, making it suitable for practical applications. Two potential enhancements are applying synthetic minority oversampling to classify minority attack classes more accurately and running more tests on new data from other network environments using the CIC-IDS-2017 dataset [17].
In a survey, Kwon, D., Kim, H. et al. [18] investigated deep learning approaches in anomaly-based network intrusion detection. The authors presented a comprehensive outline of anomaly detection techniques, encompassing data reduction, dimension reduction, classification, and deep learning approaches. The survey [18] covered prior research on deep learning techniques for network anomaly detection, emphasizing its advantages and possible drawbacks. Additionally, the authors conducted local experiments and discovered encouraging outcomes when analyzing network traffic using a Fully Convolutional Network (FCN) model. Regarding anomaly detection accuracy, the FCN model outperformed traditional machine learning methods like Support Vector Machine (SVM), Random Forest, and Adaboosting. Intrusion Detection Systems, or IDSs, are crucial for computer security because they identify and halt malicious activity in computer networks.
An enhanced IDS that incorporates two-level classifier ensembles and hybrid feature selection is presented in [19]. The hybrid feature selection method uses the Ant Colony Algorithm, Genetic Algorithm, and Particle Swarm Optimization to minimize the feature size of the training datasets (UNSW-NB15 and NSL-KDD). Features were selected based on how well a Reduced Error Pruning Tree (REPT) classifier performed in classification. The two-level classifier ensemble is built from two meta-learners, rotation forest and bagging. The suggested classifier outperformed recently developed classification algorithms, displaying 85.8% accuracy, 86.8% sensitivity, and an 88.0% detection rate on the NSL-KDD dataset. The UNSW-NB15 dataset showed similar improvements. The researchers further validated the experimental results using a two-step statistical significance test, which strengthened the case for the suggested classifier in ongoing IDS research.
The intelligent Intrusion Detection System described in [20] substantially enhanced detection performance using an Artificial Neural Network (ANN) model. The ANN-based classifier showed 100% sensitivity on the test dataset, with high precision in minimizing false positives. The authors [20] integrated an offline method with a current dataset to find patterns in web application attacks. Aims for future studies include applying this method to the most recent dataset from the Canadian Institute for Cybersecurity (CIRA-CIC-DoHBrw-2020), implementing further real-time optimizations, and integrating it into an online Intrusion Detection System for testing with real-time network data.
The performance of machine learning techniques such as the Broad Learning System (BLS), Radial Basis Function Network BLS (RBF-BLS), and BLS with cascades of mapped features and enhancement nodes was assessed in [21]. The study examined malicious intrusions and anomalies within communication networks in depth, using subsets of the CICIDS 2017 and CSE-CIC-IDS 2018 datasets that contain DoS attacks. Even with fewer relevant features, the models' performances were comparable [21]. Most models attained accuracies and F-Scores above 90%, although larger numbers of mapped features and enhancement nodes increased memory requirements and training times. Compared to non-incremental BLS, incremental BLS had a noticeably shorter training time [21].
In [22], a thorough examination of deep learning techniques for intrusion detection was performed. Seven deep learning models were evaluated for their binary and multiclass classification performance: deep neural networks, convolutional neural networks, restricted Boltzmann machines, recurrent neural networks, deep belief networks, deep Boltzmann machines, and deep Auto Encoders. CSE-CIC-IDS 2018 and TensorFlow were used as the benchmark dataset and software library, respectively.
The evaluation relied upon critical performance indicators such as accuracy, detection, and false alarm rates [23], comprehensively assessing the system’s capabilities. The outcomes indicate that deep neural networks can detect intrusion in software-defined networks. This approach demonstrated good results on the latest network attack datasets, leveraging the specific characteristics of Software Defined Network (SDN), such as using the network as a sensor through OpenFlow. Utilizing machine learning techniques for traffic analysis in SDN can enhance the efficiency of network resource allocation. Future work will focus on further analysis of intrusion detection methods and developing an Intrusion Detection System (IDS) using machine learning methods [23].
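The performance indicators named above (accuracy, detection rate, and false alarm rate) all derive from the confusion matrix of a binary classifier. The following minimal sketch computes them, together with precision and F1-score, from true and predicted labels. The function names and the toy label vectors are assumptions for illustration, and the label 1 is assumed to denote an attack.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, treating 1 as 'attack'."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def ids_metrics(y_true, y_pred):
    """Return common IDS evaluation metrics as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    detection_rate = ratio(tp, tp + fn)  # also called recall or sensitivity
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "detection_rate": detection_rate,
        "false_alarm_rate": ratio(fp, fp + tn),
        "precision": precision,
        "f1": ratio(2 * precision * detection_rate, precision + detection_rate),
    }


# Hypothetical labels: 4 attacks and 6 normal flows.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = ids_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On imbalanced traffic, accuracy alone can look high even when few attacks are caught, which is why the detection rate and false alarm rate are usually reported alongside it.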
In ref. [24], the authors developed an ID system using Spark. Moreover, the Convolutional Autoencoder (Conv-AE) deep learning approach efficiently learned feature representations from the CICIDS 2018 dataset. The proposed system outperformed traditional security approaches regarding attack detection rate, accuracy, and computation complexity. The research suggests [24] that this approach can be extended to other fields, such as real-time anomaly and network misuse detection, using deep learning as an attribute extraction tool.
To detect network threats, Ref. [25] suggested three models (LSTM, Apache Spark, and Random Forest) that use the random forest approach to reduce dimensionality. Oversampling and undersampling strategies were applied because the dataset is inherently imbalanced. Apache Spark had the fastest training time across all classes compared to the deep learning (DL) models. Future work could involve semi-supervised learning and expand beyond signature-based Intrusion Detection Systems.
One study reported that the Long Short-Term Memory (LSTM) algorithm attained a maximum accuracy of 99 percent [26].
Modern cybersecurity requires detecting network breaches, and researchers have used machine learning to detect and stop attacks [27]. Deep learning has recently drawn more attention in network intrusion detection. However, evaluations that rely on outdated datasets often leave class imbalance unaddressed, which may produce biased conclusions. In [27], the authors resolve this imbalance and assess Deep Neural Networks, Convolutional Neural Networks, and Long Short-Term Memory Networks on a balanced dataset. The Deep Neural Network performed best, with the lowest False Alarm Rate (2.615%), an F1-Score of 83.799%, and an accuracy of 84.312% [27].
Network Intrusion Detection Systems rely on efficient algorithms and methods for real-time attack detection [28]. One study proposed a preprocessing technique that significantly reduced traffic analysis time while achieving high success rates. Experimental results showed that the ExtraTree algorithm obtained a 99.0% detection rate for binary classification with an 82.96% reduction in processing time per sample. In the multiclass detection scenario, the Random Forest method produced a 98.5% detection rate with a 64.43% decrease in processing time per sample. These results indicate classification rates similar to those of previous research, with much shorter test durations [28].