Adversarial Attacks in Camera-Based Vision Systems

Vision-based perception modules are increasingly deployed in many applications, especially autonomous vehicles and intelligent robots. These modules are used to acquire information about the surroundings and to identify obstacles. Accurate detection and classification are therefore essential for reaching sound decisions and taking safe, appropriate actions at all times. Adversarial attacks can be categorized into digital and physical attacks.

  • adversarial machine learning
  • physical adversarial attack
  • digital adversarial attacks
  • security
  • autonomous vehicles
  • camera-based vision systems
  • classification
  • intelligent robots

1. Introduction

The revolutionary emergence of deep learning (DL) has had a profound impact across diverse sectors, particularly in the realm of autonomous driving [1]. Prominent players in the automotive industry, such as Google, Audi, BMW, and Tesla, are actively harnessing this technology in conjunction with cost-effective cameras to develop autonomous vehicles (AVs). These AVs are equipped with state-of-the-art vision-based perception modules that enable them to navigate real-life scenarios even under high-pressure circumstances, make informed decisions, and execute safe and appropriate actions.
Consequently, the demand for autonomous vehicles has soared, leading to substantial growth in the AV market. Strategic Market Research (SMR) projects that the autonomous vehicle market will reach a valuation of $196.97 billion by 2030, corresponding to a compound annual growth rate (CAGR) of 25.7% (ACMS). The integration of DL-powered vision-based perception modules has undeniably accelerated the progress of autonomous driving technology, heralding a transformative era in the automotive industry. As AVs become more prevalent, their potential impact on road safety, transportation efficiency, and overall user experience remains a subject of great interest to consumers, researchers, and investors alike.
However, despite the significant advancements in deep learning models, they are not immune to adversarial attacks, which can pose serious threats to their integrity and reliability. Adversarial attacks involve manipulating the input of a deep learning classifier by introducing carefully crafted perturbations, strategically chosen by malicious actors, to force the classifier into producing incorrect outputs. Such vulnerabilities can be exploited by attackers to compromise the security and integrity of the system, potentially endangering the safety of individuals interacting with it. For instance, a malicious actor could add adversarial noise to a stop sign, causing an autonomous vehicle to misclassify it as a speed limit sign [2][3]. This kind of misclassification could lead to dangerous consequences, including accidents and loss of life. Notably, adversarial examples have been shown to be effective in real-world conditions [4]. Even when printed out, an image specifically crafted to be adversarial can retain its adversarial properties under different lighting conditions and orientations.
Therefore, it becomes crucial to understand and mitigate these adversarial attacks to ensure the development of safe and trustworthy intelligent systems. Taking measures to defend against such attacks is imperative for maintaining the reliability and security of deep learning models, particularly in critical applications such as autonomous vehicles, robotics, and other intelligent systems that interact with people.
Adversarial attacks can broadly be categorized into two types: Digital Attacks and Physical Attacks, each distinguished by its unique form of attack [3][4][5]. In a Digital Attack, the adversary introduces imperceptible perturbations to the digital input image, specifically tailored to deceive a given deep neural network (DNN) model. These perturbations are carefully optimized to remain unnoticed by human observers. During the generation process, the attacker works within a predefined noise budget, ensuring that the perturbations do not exceed a certain magnitude to maintain imperceptibility. In contrast, Physical Attacks involve crafting adversarial perturbations that can be translated into the physical world. These physical perturbations are then deployed in the scene captured by the victim DNN model. Unlike digital attacks, physical attacks are not bound by noise magnitude constraints. Instead, they are primarily constrained by location and printability factors, aiming to generate perturbations that can be effectively printed and placed in real-world settings without arousing suspicion.
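To make the noise-budget constraint concrete, the following is a minimal sketch of how a digital perturbation can be kept within a fixed L-infinity magnitude before being added to an image. It assumes a PyTorch workflow, image tensors scaled to [0, 1], and a budget of 8/255; these choices and the function names are illustrative rather than details taken from the works cited above.

    import torch

    def project_perturbation(delta: torch.Tensor, epsilon: float) -> torch.Tensor:
        # Enforce the noise budget: every pixel change stays within +/- epsilon.
        return torch.clamp(delta, -epsilon, epsilon)

    def apply_perturbation(image: torch.Tensor, delta: torch.Tensor, epsilon: float) -> torch.Tensor:
        # Add the budget-constrained perturbation and keep pixel values valid.
        delta = project_perturbation(delta, epsilon)
        return torch.clamp(image + delta, 0.0, 1.0)

    # Illustrative usage with random data.
    image = torch.rand(1, 3, 224, 224)        # one RGB image in [0, 1]
    delta = 0.1 * torch.randn_like(image)     # raw, unconstrained noise
    adv_image = apply_perturbation(image, delta, epsilon=8 / 255)

In a physical attack, by contrast, the constraint is not a pixel-wise budget but the requirement that the perturbation be printable and plausibly placed in the scene.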
For an adversarial attack to be relevant in real-world scenarios, its primary objective is to remain inconspicuous, appearing common and plausible rather than overtly hostile. Many previous works on adversarial patches for image classification have focused mainly on maximizing attack performance and strengthening the adversarial noise. However, this approach often results in conspicuous patches that are easily recognizable by human observers. Another line of research has aimed to improve the stealthiness of the added perturbations by making them blend seamlessly into natural styles that appear legitimate to human observers. Examples include camouflaging the perturbations as color films [7], shadows [6], or laser beams [8], among others.
Figure 1 provides a visual comparison of AdvRain with existing physical attacks. While all the adversarial examples in Figure 1 successfully attack deep neural networks (DNNs), AdvRain stands out in its ability to generate adversarial perturbations with natural blurring marks (emulating the blur that actual rain produces on a camera lens), unlike the conspicuous pattern generated by AdvPatch [10] or the unrealistic patterns generated by FakeWeather [9]. This showcases the effectiveness of AdvRain in creating adversarial perturbations that blend in with the surrounding environment, making them difficult for human observers to detect.
Figure 1. AdvRain vs. existing physical-world attacks: (a) AdvCF [7], (b) Shadow [6], (c) AdvPatch [10], (d) AdvCam [8], (e) FakeWeather [9], and (f) AdvRain.

2. Adversarial Attacks in Camera-Based Vision Systems

Environment perception has emerged as a crucial application, driving significant efforts in both the industry and research community. The focus on developing powerful deep learning (DL)-based solutions has been particularly evident in applications such as autonomous robots and intelligent transportation systems. Designing reliable recognition systems is among the primary challenges in achieving robust environment perception. In this context, automotive cameras play a pivotal role, along with associated perception models such as object detectors and image classifiers, forming the foundation for vision-based perception modules. These modules are instrumental in gathering essential information about the environment, aiding autonomous vehicles (AVs) in making critical decisions for safe driving.
The pursuit of dependable recognition systems represents a major hurdle in establishing high-performance environment perception. The accuracy and reliability of such systems are critical for the safe operation of AVs and ensuring their successful integration into real-world scenarios. However, it has been demonstrated that these recognition systems are susceptible to adversarial attacks, which can undermine their integrity and pose potential risks to AVs and their passengers. In light of these challenges, addressing the vulnerabilities of DL-based environment perception systems to adversarial attacks becomes paramount. Research and development efforts must focus on building robust defense mechanisms to fortify these systems against potential threats, enabling the safe and trustworthy deployment of AVs and intelligent transportation systems in the future.

Digital Adversarial Attacks

In scenarios involving digital attacks, the attacker has the flexibility to manipulate the input image of a victim deep neural network (DNN) at the pixel level. These attacks assume that the attacker has access to the DNN's input system, such as a camera or other means of providing input data. The concept of adversarial examples, where a small, imperceptible noise is injected to shift the model's prediction towards the wrong class, was first introduced by Szegedy et al. [11]. Over time, various algorithms for creating adversarial examples have been developed, advancing digital attacks. Notable examples include Carlini and Wagner's attack (CW) [5], the Fast Gradient Sign Method (FGSM) [3], the Basic Iterative Method (BIM) [12], local search attacks [13], and the HopSkipJump attack (HSJ) [14], among others.
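As an illustration of how these gradient-based digital attacks operate, the following is a minimal FGSM-style sketch in PyTorch: the input is perturbed one step in the direction of the sign of the loss gradient, bounded by a budget epsilon. The model, labels, and epsilon value are placeholders, and the original papers should be consulted for the exact formulations; BIM essentially repeats this step several times with a smaller step size and a projection back onto the epsilon ball.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, image, label, epsilon=8 / 255):
        # One-step Fast Gradient Sign Method (FGSM) sketch.
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), label)
        loss.backward()
        # Move each pixel by +/- epsilon in the direction that increases the loss.
        adv_image = image + epsilon * image.grad.sign()
        return torch.clamp(adv_image, 0.0, 1.0).detach()

    # Illustrative usage with a pretrained classifier (model choice is arbitrary):
    # from torchvision.models import resnet18, ResNet18_Weights
    # model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
    # adv_batch = fgsm_attack(model, image_batch, label_batch)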

Physical Adversarial Attacks

A physical attack involves adding perturbations in the physical space to deceive the target model. Crafting a physical perturbation typically involves two main steps: the adversary first generates an adversarial perturbation in the digital space and then reproduces it in the physical space, where it can be perceived by sensors such as cameras and radars, effectively fooling the target model. Existing methods for placing adversarial perturbations can be categorized into four main groups.
Attack by directly modifying the targeted object: In this approach, the attacker directly modifies the targeted object to introduce the adversarial perturbation. For example, adversarial clothing has been proposed, where clothing patterns are designed to confuse object detectors [15][16][17]. Hu et al. [15] leverage pretrained GAN models to generate realistic, naturalistic images that can be printed on t-shirts and are capable of hiding the person wearing them. Guesmi et al. [16] proposed replacing the GAN with a semantic constraint, implemented by adding a similarity term to the loss function and directly manipulating the pixels of the image, which provides greater flexibility to incorporate multiple transformations.
Attack by modifying the background: Adversarial patches are a category of adversarial perturbations designed to manipulate localized regions within an image in order to mislead classification models. These attacks leverage the model's sensitivity to local alterations, introducing small changes that have a substantial impact on the model's predictions. By exploiting the model's reliance on specific image features or patterns, adversaries can craft patches that trick the model into misclassifying the image or perceiving it differently from its actual content. An example of a practical attack for real-world scenarios is AdvPatch [10], which creates universal patches that can be applied anywhere and incorporates Expectation over Transformation (EOT) [18] to enhance the robustness of the adversarial patch (see the optimization sketch at the end of this section). The AdvCam technique [8] takes an innovative approach to image perturbation, operating in the style space: it combines principles from neural style transfer and adversarial attacks to craft perturbations that blend seamlessly into an image's visual style. For instance, AdvCam can introduce perturbations such as rust-like spots on a stop sign, making them appear natural and inconspicuous within their surroundings.
Attack by modifying the camera: This method modifies the camera itself to introduce the adversarial perturbation. One approach leverages the rolling shutter effect, manipulating the timing with which different parts of the image are captured in order to create perturbations [19][20].
Attack by modifying the medium between the camera and the object: This category covers attacks that alter the medium between the camera and the object. For instance, light-based attacks use external light sources to create perturbations that are captured by the camera and mislead the target model [6][21]. The Object Physical Adversarial Device (OPAD) [21] employs structured lighting to alter the appearance of a targeted object; the attack system consists of a cost-effective projector, a camera, and a computer, enabling the manipulation of real-world objects in a single shot. Zhong et al. [6] harness the natural phenomenon of shadows to create adversarial examples. Their method is designed to be practical in both digital and physical contexts and, unlike traditional gradient-based optimization algorithms, relies on optimization strategies grounded in particle swarm optimization (PSO) [22]. The researchers conducted extensive assessments in both simulated and real-world scenarios, revealing the potential threat posed by shadows as a viable avenue for attacks; however, such techniques may lose effectiveness under varying lighting conditions. The Adversarial Color Film (AdvCF) method, introduced by Zhang et al. [7], places a color film between the camera lens and the subject of interest to enable effective physical adversarial attacks. By adjusting the physical characteristics of the color film without altering the appearance of the target object, AdvCF aims to create adversarial perturbations that remain effective under various lighting conditions, including both daytime and nighttime settings. The FakeWeather attack [9] emulates the effects of weather conditions such as rain, snow, and hail on camera lenses, seeking to deceive computer vision systems, particularly those used in autonomous vehicles and other image-based applications, by adding perturbations that mimic the distortions caused by adverse weather. The adversary designs specific masks or patterns that simulate the visual artifacts produced by different weather conditions; these masks are then applied to the camera's images, introducing distortions that can mislead image recognition models and make the images appear as if they were captured in inclement weather. One limitation of FakeWeather is that the generated perturbations may exhibit unrealistic, pixelated patterns that more robust image recognition systems could detect; in addition, its effectiveness may be limited to specific scenarios and image sizes, as it was initially evaluated on 32 × 32 images from the CIFAR-10 dataset.
Additionally, researchers have explored techniques that give adversarial perturbations natural styles to ensure stealthiness and legitimacy to human observers. Such approaches aim to make the perturbations appear as natural phenomena in the scene.
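To illustrate how Expectation over Transformation (EOT) is typically combined with patch-based attacks, the following is a minimal, hypothetical optimization loop in PyTorch. The transformation set (random brightness and placement), the targeted loss, and all function names are simplified stand-ins chosen for illustration; they are not the exact procedures used by AdvPatch, AdvCam, or the other works discussed above. The data loader is assumed to yield (image, label) batches with pixel values in [0, 1].

    import torch
    import torch.nn.functional as F

    def paste_patch(images, patch, top, left):
        # Overwrite a square region of each image with the (transformed) patch.
        out = images.clone()
        ph, pw = patch.shape[-2:]
        out[:, :, top:top + ph, left:left + pw] = patch
        return out

    def eot_patch_attack(model, loader, target_class, patch_size=50,
                         steps=200, lr=0.05, device="cpu"):
        # Sketch of an EOT-style universal patch optimization loop.
        # Each step samples a random transformation (brightness jitter plus a
        # random location, a simple stand-in for the richer transformation
        # distributions used in practice) and pushes the model's prediction on
        # the patched images toward `target_class`.
        patch = torch.rand(3, patch_size, patch_size, device=device, requires_grad=True)
        optimizer = torch.optim.Adam([patch], lr=lr)

        for _, (images, _) in zip(range(steps), loader):
            images = images.to(device)
            _, _, h, w = images.shape

            brightness = 0.8 + 0.4 * torch.rand(1, device=device)
            top = int(torch.randint(0, h - patch_size, (1,)))
            left = int(torch.randint(0, w - patch_size, (1,)))

            patched = paste_patch(images, torch.clamp(patch * brightness, 0, 1), top, left)
            target = torch.full((images.size(0),), target_class,
                                dtype=torch.long, device=device)
            loss = F.cross_entropy(model(patched), target)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            with torch.no_grad():
                patch.clamp_(0, 1)  # keep the patch a valid (printable) image

        return patch.detach()

Averaging the loss over many such sampled transformations is what gives the optimized patch its robustness to the viewpoint, lighting, and placement changes encountered when it is printed and deployed in the physical world.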