Lightweight IoT Intrusion Detection Systems: History

Cyber security has become increasingly challenging due to the proliferation of the Internet of Things (IoT), where a massive number of tiny, smart devices push trillions of bytes of data to the Internet, a volume expected to reach 73.1 ZB (zettabytes) by 2025. IoT devices have limited computational capabilities, so researchers have shifted their focus to designing lightweight intrusion-detection systems (IDS) that can deliver the needed security while operating on those thin devices.

  • internet of things
  • intrusion detection systems

1. Introduction

Intrusion detection is a critical component of Internet of Things (IoT) security. The proliferation of connected devices and the increasing amount of data being transmitted create opportunities for malicious actors to exploit vulnerabilities [1]. An essential challenge in developing effective intrusion-detection systems for IoT applications is handling large volumes of data while preserving data privacy and minimizing energy consumption [2]. Network nodes experience diverse traffic patterns, so a standalone intrusion-detection system (IDS) node learns only from the traffic it can access, which delays attack detection. Collaborative IDS, in turn, risk privacy breaches, as sensitive information may be shared across nodes.

2. Lightweight IoT Intrusion Detection Systems

2.1. Lightweight IDS for IoT

Cyber security has become increasingly challenging due to the proliferation of the Internet of Things (IoT), where a massive number of tiny, smart devices push trillions of bytes of data to the Internet, a volume expected to reach 73.1 ZB (zettabytes) by 2025 [3]. IoT devices have limited computational capabilities, so researchers have shifted their focus to designing lightweight IDS that can deliver the needed security while operating on those thin devices.
Zarpelão et al. [4] surveyed IDS developments for IoT and found a growing interest in lightweight IDS. The authors identified two tracks that claim to be lightweight:
  • Signature-based lightweight IDS (such as [5]): this track is beyond the scope of this work.
  • Anomaly-based lightweight IDS: this work focuses on this research track.
Lee et al. [6] detected 6LowPAN attacks by observing IoT nodes’ reported energy consumption. To deal with energy consumption attacks, Le et al. [7] created a lightweight intrusion-detection system that restricts sensing operations to cluster heads, allowing the remaining nodes to operate normally. This approach is aligned with that of Raza et al. [8]. Jan et al. [9] concentrated on creating a computationally lightweight IDS using support vector machines, a supervised machine learning (ML) technique, which limits the IDS neither to a single attack type (as in [6]) nor to a subset of the nodes running the IDS (as in [7][8]).
By limiting the number of investigated features, Soe et al. [10] developed a lightweight anomaly-based IDS strategy that selects the features with the highest gain ratio and discards all others, thus reducing the amount of computation required. It is worth noting that this strategy runs the risk of missing out on rare attacks that can only be detected using discarded features. This method is consistent with that proposed by Davahli et al. [11], where feature selection is based on the hybridization of a genetic algorithm (GA) and the Grey Wolf Optimizer (GWO).
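The gain-ratio ranking described above can be sketched as follows. This is a minimal illustration, not the implementation from [10]: it assumes discrete (already binned) feature values, and the feature names, toy records, and helper names (entropy, gain_ratio, select_top_features) are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the feature's own entropy."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy once the feature value is known.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:  # a constant feature carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info


def select_top_features(columns, labels, k):
    """Return the names of the k features with the highest gain ratio."""
    ranked = sorted(columns, key=lambda name: gain_ratio(columns[name], labels), reverse=True)
    return ranked[:k]


# Toy, made-up traffic records (hypothetical feature names and values).
columns = {
    "protocol": ["tcp", "tcp", "udp", "udp", "tcp", "udp"],
    "flag": ["syn", "syn", "ack", "ack", "syn", "syn"],
    "size_bucket": ["small", "large", "small", "large", "small", "large"],
}
labels = ["attack", "attack", "normal", "normal", "attack", "normal"]

print(select_top_features(columns, labels, k=2))  # ['protocol', 'flag']
```

In this toy data, size_bucket is discarded because it says almost nothing about the label. The same mechanism is what creates the risk noted above: a feature that matters only for a rare attack scores low overall and is dropped.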
Khater et al. [12] combined the last two strategies (feature reduction and supervised deep learning) to enhance the communication security of lightweight IoT devices in a Fog computing environment. To meet the lightweight criterion, a combination of Modified Vector Space Representation (MVSR) and N-grams (1-gram and 2-gram) was used for system call encoding in the feature extraction phase, with a sparse matrix used for space reduction. The extracted features were then fed into a Multilayer Perceptron (MLP) model with a single hidden layer that classifies the nature of the network traffic.
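To make the encoding step more concrete, the sketch below counts 1-grams and 2-grams over a system-call trace and stores only the non-zero counts. It uses plain n-gram counts and does not reproduce the MVSR scheme of [12]. The system-call IDs and the helper names (ngrams, build_vocabulary, encode_sparse) are assumptions made for illustration.

```python
from collections import Counter


def ngrams(trace, n):
    """All contiguous n-grams of a system-call trace, as tuples."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def build_vocabulary(traces, orders=(1, 2)):
    """Map every n-gram seen in the training traces to a column index."""
    vocab = {}
    for trace in traces:
        for n in orders:
            for gram in ngrams(trace, n):
                vocab.setdefault(gram, len(vocab))
    return vocab


def encode_sparse(trace, vocab, orders=(1, 2)):
    """Encode one trace as {column index: count}, storing only non-zero entries."""
    counts = Counter()
    for n in orders:
        for gram in ngrams(trace, n):
            if gram in vocab:  # n-grams unseen during training are ignored
                counts[vocab[gram]] += 1
    return dict(counts)


# Made-up system-call IDs, for illustration only.
training_traces = [[3, 4, 3, 5], [3, 5, 5, 6]]
vocab = build_vocabulary(training_traces)
row = encode_sparse([3, 4, 3, 7], vocab)

print(len(vocab))  # 9 columns in the full feature space
print(row)         # {0: 2, 1: 1, 3: 1, 4: 1}
```

The encoded row keeps four entries instead of nine, which is the space saving a sparse matrix provides once the n-gram vocabulary grows large. Each such row would then be the input vector of the classifier.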
Instead of being selective on the features (as in [10]) or on the nodes (as in [7][8]), Sedjelmaci et al. [13] proposed a strategy that is selective on time. The authors proposed a game-theoretic approach for identifying the times when attacks are most likely to happen. Only then is the IDS functionality enabled.
Deep neural networks (DNNs) have been applied in the hope of improving the detection accuracy of lightweight IDS. One of the most recent applications is “Realguard” by Nguyen et al. [14], a DNN-based IDS that implements a simple MLP with 5 hidden layers. Realguard can run on low-end IoT devices while achieving high attack detection accuracy.

2.2. Sampling Algorithms for IDS

IoT devices cannot handle all sent data due to rising network overhead and stagnant power storage capacity. To mitigate this, researchers have turned to sampling the data before analysis in order to reduce its volume. This approach must prevent information loss to avoid compromising threat detection accuracy. Sampling techniques are designed to optimize IDS efficiency and attack detection accuracy. IoT nodes sample packets, creating a subset of network traffic for subsequent analysis and detection. The success of a sampling method depends heavily on factors such as the sampling rate and the chosen strategy.
A network-based IDS (NIDS) analyzes data samples in the form of network packets. Thus, the population is all packets in the network traffic, whereas the subset is a selection of them. Since only a specific number of packets are taken for analysis, the essential parameter is the sampling rate, or sampling ratio, which determines the final size of the subset relative to the original population. Some sampling algorithms may produce a sample whose size deviates from the target. Sampling algorithms are either static or dynamic. A static sampling process is conducted periodically or randomly following a given rule or data interval. These rules fall under three main categories of sampling decisions: count-based, time-based, and content-based. Every static algorithm that samples data based on its ordering position in a stream of packets is count-based. A time-based algorithm focuses on the arrival time of a packet (its timestamp). Finally, content-based sampling methods analyze the content of the packet before data selection. As this last method increases overhead and computation time, content-based algorithms, also known as filtering algorithms, are beyond the scope of this research. The main advantage of a static sampling algorithm is reduced bandwidth and storage requirements, as only a subset is retained for anomaly detection analysis. Dynamic, or adaptive, sampling algorithms in turn use different sampling intervals and/or rules for data sample decisions.
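The static sampling decisions described above can be illustrated with a short sketch. It is a simplified illustration under stated assumptions: packets are modelled as dictionaries with a hypothetical integer "ts_ms" arrival timestamp, and the function names and parameter values are made up for this example rather than taken from the cited works.

```python
import random


def systematic_count_sampling(packets, interval, offset=0):
    """Count-based: keep every `interval`-th packet, starting at position `offset`."""
    return packets[offset::interval]


def systematic_time_sampling(packets, period_ms, window_ms):
    """Time-based: keep packets arriving in the first `window_ms` of every `period_ms`."""
    return [p for p in packets if p["ts_ms"] % period_ms < window_ms]


def random_sampling(packets, rate, seed=None):
    """Random rule: keep each packet independently with probability `rate`."""
    rng = random.Random(seed)
    return [p for p in packets if rng.random() < rate]


def sampling_ratio(sample, population):
    """Achieved subset size relative to the original population."""
    return len(sample) / len(population)


# Made-up traffic: 100 packets arriving 100 ms apart.
packets = [{"id": i, "ts_ms": i * 100} for i in range(100)]

by_count = systematic_count_sampling(packets, interval=5)
by_time = systematic_time_sampling(packets, period_ms=1000, window_ms=200)
by_chance = random_sampling(packets, rate=0.2, seed=1)

print(sampling_ratio(by_count, packets))   # 0.2
print(sampling_ratio(by_time, packets))    # 0.2
print(sampling_ratio(by_chance, packets))  # near 0.2, but not guaranteed to equal it
```

The two systematic rules hit the target ratio exactly on this evenly spaced traffic, whereas the random rule only matches it on average. That is one way a sampling algorithm can end up with a sample size different from the one the sampling rate implies.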
In this context, several studies have looked at the effects of data sampling. Mai et al. [15] used various sampling algorithms to investigate the effect of sampling high-speed IP-backbone network traffic on intrusion-detection outcomes, specifically port scans and volume anomaly detection. Roudière et al. [16] tested the accuracy of the “Autonomous Algorithm for Traffic Anomaly Characterization” detector in detecting DDoS attacks over traffic sampled under various sampling policies. The authors of [17][18] investigated how packet sampling influenced anomaly detection results. Silva et al. [19] proposed a framework for evaluating the effects of packet sampling. They examined the effectiveness of each sampling algorithm and proposed a set of metrics for assessing each sampling technique’s ability to produce a representative sample of the original traffic. Bartos et al. [20] investigated the impact of traffic sampling on anomaly identification and presented a new adaptive flow-level sampling algorithm to improve the accuracy of the sampling process. Using traces containing the Blaster worm, Brauckhoff et al. [21] assessed the accuracy of existing anomaly detection and data sampling algorithms. Liu et al. [22] implemented a novel Difficult Set Sampling Technique (DSSTE) to tackle the class imbalance problem, which helped in the detection of rare attacks. They used Edited Nearest Neighbor (ENN) to identify the difficult set, then applied the K-means algorithm to compress the majority samples in the difficult set, and finally augmented the data of the clusters to obtain the final sample. A more thorough discussion can be found in the authors’ previous survey [23] and benchmarking [24] works, which investigated all data sampling strategies, their impact on detecting various attacks, and the behavior and robustness of features under various sampling strategies.
Those works also examined how the estimation of network features varies depending on the sampling method, sample size, and other factors, and how this affects statistical inference from these data.
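As a rough illustration of the first DSSTE step mentioned above, the sketch below splits a labelled data set into an easy set and a difficult set using the standard ENN rule: a sample is flagged when the majority label among its k nearest neighbours differs from its own. Whether [22] uses exactly this membership rule is an assumption. The later K-means compression and augmentation steps are not shown, and the toy points and function names are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_easy_difficult(points, labels, k=3):
    """Return (easy, difficult) index lists using the ENN neighbourhood rule.

    A sample is 'difficult' when the majority label among its k nearest
    neighbours differs from its own label; otherwise it is 'easy'.
    """
    easy, difficult = [], []
    for i, point in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        nearest = sorted(others, key=lambda j: squared_distance(point, points[j]))[:k]
        majority = Counter(labels[j] for j in nearest).most_common(1)[0][0]
        (easy if majority == labels[i] else difficult).append(i)
    return easy, difficult


# Made-up 2-D feature vectors: a normal cluster, an attack cluster,
# and one attack sample sitting inside the normal cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
labels = ["normal"] * 4 + ["attack"] * 4

easy, difficult = split_easy_difficult(points, labels, k=3)
print(easy)       # [0, 1, 2, 3, 4, 5, 6]
print(difficult)  # [7]
```

Only the attack sample surrounded by normal traffic lands in the difficult set, which is the kind of sample a classifier trained on imbalanced data tends to misclassify.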

This entry is adapted from the peer-reviewed paper 10.3390/s23167038

References

  1. Huč, A.; Šalej, J.; Trebar, M. Analysis of machine learning algorithms for anomaly detection on edge devices. Sensors 2021, 21, 4946.
  2. Tekin, N.; Acar, A.; Aris, A.; Uluagac, A.S.; Gungor, V.C. Energy consumption of on-device machine learning models for IoT intrusion detection. Internet Things 2023, 21, 100670.
  3. Internet of Things Statistics for 2023—Taking Things Apart. Available online: https://dataprot.net/statistics/iot-statistics/ (accessed on 21 March 2023).
  4. Zarpelão, B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 2017, 84, 25–37.
  5. Oh, D.; Kim, D.; Ro, W.W. A malicious pattern detection engine for embedded security systems in the Internet of Things. Sensors 2014, 14, 24188–24211.
  6. Lee, T.H.; Wen, C.H.; Chang, L.H.; Chiang, H.S.; Hsieh, M.C. A lightweight intrusion detection scheme based on energy consumption analysis in 6LowPAN. In Advanced Technologies, Embedded and Multimedia for Human-Centric Computing; Springer: Berlin/Heidelberg, Germany, 2014; pp. 1205–1213.
  7. Le, A.; Loo, J.; Chai, K.K.; Aiash, M. A specification-based IDS for detecting attacks on RPL-based network topology. Information 2016, 7, 25.
  8. Raza, S.; Wallgren, L.; Voigt, T. SVELTE: Real-time intrusion detection in the Internet of Things. Ad Hoc Netw. 2013, 11, 2661–2674.
  9. Jan, S.U.; Ahmed, S.; Shakhov, V.; Koo, I. Toward a lightweight intrusion detection system for the internet of things. IEEE Access 2019, 7, 42450–42471.
  10. Soe, Y.N.; Feng, Y.; Santosa, P.I.; Hartanto, R.; Sakurai, K. Towards a lightweight detection system for cyber attacks in the IoT environment using corresponding features. Electronics 2020, 9, 144.
  11. Davahli, A.; Shamsi, M.; Abaei, G. Hybridizing genetic algorithm and grey wolf optimizer to advance an intelligent and lightweight intrusion detection system for IoT wireless networks. J. Ambient. Intell. Humaniz. Comput. 2020, 11, 5581–5609.
  12. Khater, B.S.; Abdul Wahab, A.W.; Idris, M.Y.I.; Hussain, M.A.; Ibrahim, A.A.; Amin, M.A.; Shehadeh, H.A. Classifier performance evaluation for lightweight IDS using fog computing in IoT security. Electronics 2021, 10, 1633.
  13. Sedjelmaci, H.; Senouci, S.M.; Al-Bahri, M. A lightweight anomaly detection technique for low-resource IoT devices: A game-theoretic methodology. In Proceedings of the 2016 IEEE International Conference on Communications (ICC), Kuala Lumpur, Malaysia, 22–27 May 2016; pp. 1–6.
  14. Nguyen, X.H.; Nguyen, X.D.; Huynh, H.H.; Le, K.H. Realguard: A lightweight network intrusion detection system for IoT gateways. Sensors 2022, 22, 432.
  15. Mai, J.; Chuah, C.N.; Sridharan, A.; Ye, T.; Zang, H. Is sampled data sufficient for anomaly detection? In Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement, Rio de Janeiro, Brazil, 25–27 October 2006; pp. 165–176.
  16. Roudière, G.; Owezarski, P. Evaluating the Impact of Traffic Sampling on AATAC’s DDoS Detection. In Proceedings of the 2018 Workshop on Traffic Measurements for Cybersecurity, Budapest, Hungary, 20 August 2018; pp. 27–32.
  17. Pescapé, A.; Rossi, D.; Tammaro, D.; Valenti, S. On the impact of sampling on traffic monitoring and analysis. In Proceedings of the 2010 22nd International Teletraffic Congress (ITC 22), Amsterdam, The Netherlands, 7–9 September 2010; pp. 1–8.
  18. Zhang, H.; Liu, J.; Zhou, W.; Zhang, S. Sampling method in traffic logs analyzing. In Proceedings of the 2016 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), Hangzhou, China, 27–28 August 2016; Volume 1, pp. 554–558.
  19. Silva, J.M.C.; Carvalho, P.; Lima, S.R. A modular sampling framework for flexible traffic analysis. In Proceedings of the 2015 23rd International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 16–18 September 2015; pp. 200–204.
  20. Bartos, K.; Rehak, M.; Krmicek, V. Optimizing flow sampling for network anomaly detection. In Proceedings of the 2011 7th International Wireless Communications and Mobile Computing Conference, Istanbul, Turkey, 4–8 July 2011; pp. 1304–1309.
  21. Brauckhoff, D.; Tellenbach, B.; Wagner, A.; May, M.; Lakhina, A. Impact of packet sampling on anomaly detection metrics. In Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement, Rio de Janeiro, Brazil, 25–27 October 2006; pp. 159–164.
  22. Liu, L.; Wang, P.; Lin, J.; Liu, L. Intrusion detection of imbalanced network traffic based on machine learning and deep learning. IEEE Access 2020, 9, 7550–7563.
  23. Hajj, S.; El Sibai, R.; Bou Abdo, J.; Demerjian, J.; Makhoul, A.; Guyeux, C. Anomaly-based intrusion detection systems: The requirements, methods, measurements, and datasets. Trans. Emerg. Telecommun. Technol. 2021, 32, e4240.
  24. Hajj, S.; El Sibai, R.; Bou Abdo, J.; Demerjian, J.; Guyeux, C.; Makhoul, A.; Ginhac, D. A critical review on the implementation of static data sampling techniques to detect network attacks. IEEE Access 2021, 9, 138903–138938.