Stepping-Stone Intrusion Detection Resistant to Intruders’ Chaff-Perturbation

In a stepping-stone intrusion (SSI), the intruder’s identity is very difficult to discover because it is concealed by a long interactive connection chain of hosts. An effective approach to SSI detection (SSID) is to determine how many connections are contained in a connection chain.

  • connection chain
  • session manipulation
  • chaff-perturbation
  • packet crossover

1. Introduction

In an SSI, an attacker builds a chain of stepping-stone machines (see Figure 1, which shows five connections), uses SSH or telnet to log in to these stepping-stones one after another, and then launches the attack [1]. In Figure 1, Host A serves as the attack host, and Host V represents the victim system. The intruder, sitting at the attack host, remotely logs in to the stepping-stone hosts S1, S2, S3, and S4 in turn, and finally to the victim Host V. To detect SSI, any stepping-stone host between the attacker and the victim can be used as the sensor machine, on which a packet sniffer (e.g., tcpdump) runs to capture network traffic. In Figure 1, Host S2 is assumed to serve as the sensor host for SSI detection (SSID). The upstream sub-chain is the part of the chain from the intruder machine A to the sensor machine S2, and the downstream sub-chain is the remaining part, from S2 to the victim machine V.
Figure 1. A sample of a connection chain with five connections.
The goal of SSID is to decide whether a stepping-stone is being used by a hacker for a malicious intrusion [2,3,4]. If a connection leaving the sensor machine matches one of the connections arriving at the sensor, then the communication session is likely a malicious intrusion [3,5,6]. SSIs are very hard to detect because the intruder is concealed by a long TCP/IP connection chain of stepping-stone machines. In typical data communication between a server and a client, the interactive connections are independent of one another even when they are relayed. Because of this independence, it is extremely difficult to determine the origin of an SSI attack when the victim machine V is accessed via several stepping-stone hosts.
Today, intruders tend to launch cyberattacks using session manipulation techniques such as the chaff attack. A chaff attack injects intruder-created packets into a communication session, which changes not only the packets’ round-trip times (RTTs) but also the total number of packets within a given period of time. Most known SSID algorithms can easily be defeated by a chaff attack. Chaffing is widely used in attacks such as man-in-the-middle, DoS, DDoS, and SSI attacks.
One type of SSID approach determines whether a connection leaving the sensor host matches one of the connections arriving at the sensor [2,7,8]. Because this type of approach uses only the sensor host for detection, it is called host-based SSID. It is well known that most Web applications and cloud computing applications legitimately use stepping-stone hosts to gain access to a remote server such as a database server. Therefore, a high rate of false-positive errors is likely unavoidable when a host-based SSID approach is used for detection [4,9,10].

2. An Effective Approach for Stepping-Stone Intrusion Detection Resistant to Intruders’ Chaff-Perturbation via Packet Crossover

In 1995, S. Staniford-Chen et al. [1] proposed the first SSID method by comparing the actual contents of packets to determine whether a relayed pair of connections is present at the sensor host. Ref. [1] claimed that it is likely a malicious intrusion if such a pair of connections exists. However, the SSID approach proposed in [1] does not work if the network traffic is encrypted.
To overcome this problem, Y. Zhang et al. [4] developed a time-based thumbprint approach for SSID, which compares thumbprints built from the timestamps of network packets captured on the outgoing and incoming connections of the sensor host. Since packets’ timestamps are not encrypted, the time-based thumbprint method proposed in [4] works effectively for encrypted network traffic. K. Yoda et al. [16] developed a similar solution to the problems with S. Staniford-Chen’s SSID method by analyzing the deviation between two consecutive connections within a connection chain. This SSID method does not require any information from the packets’ contents; thus, it also works when the network traffic is encrypted.
However, none of the above-mentioned SSID methods are resistant to intruders’ session manipulation using chaff-perturbation and/or time-jittering. Research findings obtained by D. L. Donoho et al. [3] show that intruders’ capabilities of manipulating communication sessions are limited, and they would not be able to evade detection by camouflaging the communication sessions.
Another SSID method that does not require information about packets’ contents was developed by A. Blum et al. [2]; it counts the number of packets in the outgoing and incoming connections, respectively. Ref. [2] claims that if a pair of relayed connections is present, then the difference between these two packet counts must be bounded above. Another host-based SSID method, developed by T. He et al. [12], was claimed to be resistant to chaff attacks whose volume is proportional to the size of the network traffic. More specifically, they claim that if ∆ is an upper bound on the packet delay, then to evade SSID the intruder has to chaff at least n/(1 + λ∆) packets, where n is the total number of normal packets before any meaningless packet is chaffed.
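The bounded-difference idea attributed to [2] can be illustrated with a minimal Python sketch. It is not the algorithm of [2]: the function names, the use of raw packet timestamps, and the value of the bound are all assumptions made for illustration. The sketch tracks the running difference between the number of packets seen on an incoming connection and on an outgoing connection, and treats the pair as possibly relayed only if that difference never exceeds a chosen bound.

```python
def max_count_gap(incoming_times, outgoing_times):
    """Largest absolute difference between the running packet counts
    of two connections, scanning all packets in timestamp order."""
    # +1 for a packet on the incoming connection, -1 for an outgoing one.
    events = sorted(
        [(t, 1) for t in incoming_times] + [(t, -1) for t in outgoing_times]
    )
    gap = 0
    worst = 0
    for _, delta in events:
        gap += delta
        worst = max(worst, abs(gap))
    return worst


def looks_relayed(incoming_times, outgoing_times, bound):
    """Hypothetical test: the pair is a relay candidate only if the
    count difference stays within `bound` (an assumed parameter)."""
    return max_count_gap(incoming_times, outgoing_times) <= bound


# Packets forwarded shortly after arrival keep the gap small.
incoming = [0.0, 1.0, 2.0, 3.0]
relayed = [0.1, 1.1, 2.1, 3.1]
unrelated = [10.0, 11.0, 12.0, 13.0]
print(looks_relayed(incoming, relayed, bound=2))
print(looks_relayed(incoming, unrelated, bound=2))
```

A count-based check of this kind is also a convenient way to see why chaff matters: injected packets change the counts on one connection and can push the difference past any fixed bound.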
The first network-based SSID method was proposed by Yung et al. [14] in 2002. This approach estimates the connection chain length by calculating the ratio of the Send-Echo round-trip time (RTT) to the Send-Ack RTT. According to Yung’s method, a Send-Echo RTT reflects the number of connections contained in the downstream sub-chain, whereas a Send-Ack RTT represents the length of the one-hop connection from the sensor host to its adjacent machine on the downstream side. The problem with Yung’s SSID algorithm is that it produced very high false-negative errors because acknowledgement packets were used in the chain length estimation. These issues were identified and described by J. Yang et al. in [11].
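The ratio described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure of [14]: the function name is hypothetical, and summarizing each set of RTT samples by its median is an assumption chosen here to dampen outliers.

```python
from statistics import median


def estimate_downstream_length(send_echo_rtts, send_ack_rtts):
    """Ratio of a typical Send-Echo RTT to a typical Send-Ack RTT.

    Both arguments are lists of RTT samples in the same time unit.
    Using the median as the summary statistic is an assumption.
    """
    if not send_echo_rtts or not send_ack_rtts:
        raise ValueError("need at least one sample of each RTT type")
    one_hop = median(send_ack_rtts)
    if one_hop <= 0:
        raise ValueError("Send-Ack RTT must be positive")
    return median(send_echo_rtts) / one_hop


# Made-up samples in milliseconds: the Send-Echo RTT is about four
# times the one-hop Send-Ack RTT.
print(estimate_downstream_length([40.0, 42.0, 38.0], [10.0, 11.0, 9.0]))
```

A ratio near 1 would suggest that the sensor is adjacent to the final host, while a larger ratio suggests a longer downstream sub-chain.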
The work [11], proposed in 2004, was the second network-based SSID algorithm. It used step functions to estimate the number of connections contained in a connection chain. This research resolved the issues of Yung’s SSID algorithm in [14] by setting up the connection chain in a different way. With this improvement, each Echo packet could be matched with a corresponding Send [11]. Thus, the false-negative errors for SSID were significantly reduced in [11] compared to Yung’s SSID algorithm proposed in [14]. Unfortunately, the step-function method proposed in [11] performed effectively only in a local area network.
To overcome these issues in [11], Yang et al. proposed an SSID approach using a data mining method in [13]. The packets’ round-trip times were computed using a data mining algorithm, the maximum–minimum distance clustering algorithm. With this method, every Send packet was accurately matched by examining all the possible Echoes for that Send. The number of clusters output by the maximum–minimum distance clustering algorithm determines the connection chain length. However, the SSID method proposed in [13] must capture a huge number of TCP packets, which makes processing and analyzing the captured packets slow. Thus, the detection approach developed in [13] is inefficient once the packets’ processing time is taken into consideration.
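For readers unfamiliar with maximum–minimum distance clustering, the following Python sketch shows one common textbook form of the algorithm applied to one-dimensional RTT values. It is not the implementation of [13]: the function name, the threshold parameter `theta` and its default, the choice of the first sample as the initial center, and the sample data are all assumptions. How candidate RTTs are formed from Send and Echo packets, and how the cluster count maps to a chain length, are left to [13].

```python
def max_min_distance_centers(values, theta=0.3):
    """Pick cluster centers with a max-min distance rule.

    A new center is added while the point farthest from all existing
    centers is more than theta * (distance between the first two
    centers) away from its nearest center. theta is an assumed knob.
    """
    if not values:
        return []
    centers = [values[0]]  # assumption: start from the first sample
    farthest = max(values, key=lambda v: abs(v - centers[0]))
    base = abs(farthest - centers[0])
    if base == 0:
        return centers  # all samples are identical: one cluster
    centers.append(farthest)
    while True:
        # The candidate is the sample whose nearest center is farthest away.
        candidate = max(values, key=lambda v: min(abs(v - c) for c in centers))
        spread = min(abs(candidate - c) for c in centers)
        if spread <= theta * base:
            break
        centers.append(candidate)
    return sorted(centers)


# Made-up RTT candidates (ms) that fall into three visible groups.
rtts = [10, 11, 9, 30, 31, 29, 50, 51, 49]
centers = max_min_distance_centers(rtts)
print(len(centers), centers)
```

The result depends on `theta`: a larger value merges nearby groups and reports fewer clusters, so any real use would need that threshold to be justified against measured traffic.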
To overcome the issue with the SSID method introduced in [13], Wang et al. [15] proposed a network-based SSID algorithm using packet crossover. In [15], the number of connections in the downstream sub-chain was estimated by analyzing packet crossover ratios. The work [15] also verified that the packet crossover ratio and the downstream connection chain length increase together. However, the packet-crossover SSID algorithm proposed in [15] was not resistant to intruders’ chaff-perturbation.
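A packet crossover ratio can be illustrated with a short Python sketch. The definition used here is an assumption made for illustration and may differ from the one in [15]: a crossover is counted whenever a Send packet is observed at the sensor while an earlier Send is still waiting for its Echo, and the ratio is the number of crossovers divided by the number of Sends. The function name and the 'S'/'E' labelling of packets are likewise hypothetical.

```python
def crossover_ratio(packet_types):
    """Fraction of Send packets that arrive while an earlier Send is
    still unanswered.

    packet_types is a sequence of 'S' (Send) and 'E' (Echo) labels in
    capture order. This counting rule is an assumed simplification.
    """
    pending = 0      # Sends whose Echo has not been seen yet
    sends = 0
    crossovers = 0
    for kind in packet_types:
        if kind == "S":
            sends += 1
            if pending > 0:
                crossovers += 1
            pending += 1
        elif kind == "E":
            if pending > 0:
                pending -= 1
        else:
            raise ValueError(f"unknown packet label: {kind!r}")
    return crossovers / sends if sends else 0.0


# Strictly alternating traffic has no crossovers; the second trace
# has one Send that overlaps an unanswered Send.
print(crossover_ratio("SESESE"))
print(crossover_ratio("SSEESE"))
```

Under this simplified rule, a longer downstream sub-chain delays Echoes, so more Sends overlap with unanswered ones and the ratio rises, which matches the trend reported for [15].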
A recent work [17] proposed a method that may obtain context-free properties for installing an anomaly-based NIDS (network intrusion detection system) using a machine learning model.

This entry is adapted from the peer-reviewed paper 10.3390/electronics12183855
