Automatic Visual Pollution Detection: History

Visual pollution, characterized by disorderly and displeasing urban environments, is inherently subjective and challenging to quantify precisely. In recent years, substantial research efforts have been initiated to identify and categorize various forms of visual pollution by applying artificial intelligence and computer vision techniques. The automated recognition of visual disturbances using advanced deep learning methods can aid governmental bodies and relevant authorities in taking proactive measures. 

  • artificial intelligence
  • deep learning
  • EfficientDet
  • object detection
  • microcontroller

1. Introduction

Our environment is undergoing significant transformations as human civilization advances and harmful elements accumulate [1]. With more people congregating in cities, towns, and rural areas than ever before, human activities are quietly filling the environment with a variety of man-made pollutants [2]. Beyond their environmental impact, these contaminants can also harm our physical and mental well-being. At times, we encounter them inadvertently, and they disrupt our visual and aesthetic experience [3]. Such unfamiliar and unpleasant visual elements are collectively known as visual pollutants, and their presence constitutes visual pollution. Visual pollution results from several environmental factors [4] and can manifest in many forms, including large billboards, wind and nuclear power facilities, electrical lines, industrial structures, construction sites, street litter, and roadside advertising [5]. Despite its glaring presence, visual pollution often receives minimal attention compared with more conventional forms of environmental contamination [6]. It threatens the aesthetics of the landscape's physiognomy and diminishes overall environmental appeal; the excessive deployment of unsightly elements in urban environments produces visual blight and eyesores for city residents [7].
It is estimated that approximately 20% of global water pollution can be attributed to the dyeing and finishing processes involved in textile production [8]. The residual materials generated during textile manufacturing are referred to as textile waste [9]. This waste can arise at various stages of production, including spinning, weaving, dyeing, and finishing, and even after the final product is formed; it can be produced unintentionally or deliberately as part of efforts to improve efficiency [10]. The categorization of textile contaminants has consequently become a subject of increasing interest. In 2018, World Bank research reported that 242 million tons of plastic waste, constituting 12% of all solid waste, were generated globally in 2016 [11]. By 2050, the world is projected to produce 3.40 billion tons of solid waste annually, a substantial increase over the 2.01 billion tons currently generated each year. Hazardous working conditions in handling such waste can also result in infections affecting the skin, respiratory system, and digestive system [12]. Visual pollution encompasses the adverse effects of pollutants that impair vision and mental health, thereby reducing overall quality of life [13]; as noted above, it originates from various environmental factors and manifests in forms ranging from large billboards, power facilities, and electrical cables to industrial structures, construction sites, and street litter [14].
Every day, our efforts are directed toward modernizing the world; however, this modernization takes a toll on the environment. Rapidly expanding cities around the globe are becoming cluttered with undesirable visual elements, and the pollution caused by these visible objects in our surroundings is termed visual pollution. Visual pollutants encompass any objects that are aesthetically displeasing and intrusive to the observer, including billboards, tangles of electric wires, street litter, construction materials, graffiti, cellphone towers, and worn-out buildings, as well as industrial clothes dumps, industrial textile billboards, and industrial textile dye waste.

2. Automatic Visual Pollution Detection and Existing Gaps

Ahmed et al. [15] demonstrated the automated detection of a wide range of visual pollution using deep convolutional neural networks. The authors classified their data into four categories and employed a convolutional neural network (CNN) with multiple layers of artificial neurons; the customized CNN model attained a training accuracy of 95% and a validation accuracy of 85%. Andjarsari and her team [16] reported that billboards and street graffiti along a route can degrade an area's aesthetics and potentially obstruct vision. To investigate visual pollution, the authors employed a scenic beauty estimation (SBE) technique based on the analytic hierarchy process (AHP), combined with SWOT analysis and the quantitative strategic planning matrix (QSPM).
Hossain et al. [17] introduced artificial intelligence techniques for identifying visual contaminants in images from Google Street View. The authors selected several roads in Dhaka, the capital of Bangladesh, as their test subject, owing to its recent ranking among the world's most polluted cities. The image dataset was manually curated from photos taken from various perspectives, focusing on frames containing visual pollution; these images were annotated with the CVAT framework and used for model training, with the YOLOv5 object detection model performing detection and classification. Yang et al. [18] developed WasNet, a distinctive, lightweight neural network for trash classification. The network has only 1.5 million parameters, half as many as mainstream neural networks, and requires about 3 million floating-point operations (FLOPs) per inference, one-third the cost of other well-known lightweight networks. Despite its lightweight design, it performs respectably, achieving 64.5% accuracy on the ImageNet dataset, 82.5% on the Garbage Classification dataset, and 96.10% on the TrashNet dataset.
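To make the detection step concrete, the following minimal Python sketch loads a YOLOv5 model through the ultralytics/yolov5 torch.hub entry point and queries it on a single street-level image. The yolov5s checkpoint and the street.jpg path are illustrative placeholders; Hossain et al. trained on their own CVAT-annotated dataset, which this sketch does not reproduce.

import torch

# Load a pretrained YOLOv5 checkpoint via torch.hub ("yolov5s" is a
# small general-purpose model standing in for the authors' weights).
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Run inference on one street-level image (placeholder path).
results = model('street.jpg')

# One row per detection: xmin, ymin, xmax, ymax, confidence, class, name.
print(results.pandas().xyxy[0])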
Mittal et al. [19] introduced SpotGarbage, a novel smartphone app leveraging a deep architecture based on fully convolutional networks for garbage detection. After training on the newly released Garbage In Images (GINI) dataset, the model attained a mean accuracy of 87.69%. Furthermore, they optimized the network architecture, achieving a 96.8% reduction in prediction time and an 87.9% decrease in memory consumption without compromising accuracy. Marin and his team [20] explored three distinct feature extraction schemes for underwater marine debris using six well-established deep convolutional neural networks (CNNs) alongside other features. They compared the performance of a neural network (NN) classifier built on top of the deep CNN feature extractors in several configurations: with the extractor kept fixed, fine-tuned on the task, or kept fixed during an initial training phase and fine-tuned afterward. Their findings reveal that the improved Inception-ResNetV2 feature extractor outperforms the others, achieving an accuracy of 91.40% and an F1 score of 92.08%.
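The distinction between a fixed and a fine-tuned feature extractor can be illustrated with a short Keras sketch built on the same Inception-ResNetV2 backbone. The input size, head layout, class count, and two-phase schedule below are assumptions for illustration, not the exact configuration of Marin et al.

import tensorflow as tf

# ImageNet-pretrained Inception-ResNetV2 backbone with the
# classification head removed, used as a feature extractor.
base = tf.keras.applications.InceptionResNetV2(
    include_top=False, weights='imagenet',
    input_shape=(299, 299, 3), pooling='avg')

# Phase 1: keep the extractor fixed and train only the NN classifier.
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax'),  # e.g., debris vs. none
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_ds, epochs=5)  # train_ds is a hypothetical dataset

# Phase 2: unfreeze the backbone and fine-tune end to end at a low
# learning rate (the configuration reported to perform best).
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_ds, epochs=5)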
Tasnim et al. [21] delved into the application of computer vision techniques for the automatic detection and categorization of visual pollutants associated with the textile industry. Their research focused on three categories of textile-based visual pollutants: cloth litter; advertising billboards and signs; and textile dyeing waste materials. Deep learning algorithms, including Faster R-CNN, YOLOv5, and EfficientDet, were employed to classify the collected dataset automatically. Bakar et al. [22] utilized a standard cumulative area technique to assess visual pollution. Using a photo booklet, the authors surveyed respondents in an architectural and urban zone of Kuala Lumpur, Malaysia. The findings, which took demographic factors into account, revealed insights into visual pollutants based on the respondents' varying tolerance levels.
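Of the three detectors Tasnim et al. compared, EfficientDet is perhaps the least familiar; the sketch below runs an off-the-shelf EfficientDet-D0 checkpoint from TensorFlow Hub on a single image. The module path and the billboard.jpg file are assumptions, and this is generic inference, not the authors' textile-pollutant training setup.

import tensorflow as tf
import tensorflow_hub as hub

# Off-the-shelf EfficientDet-D0 detector (assumed TF Hub module path).
detector = hub.load('https://tfhub.dev/tensorflow/efficientdet/d0/1')

# The module expects a uint8 image batch of shape [1, H, W, 3].
image = tf.io.decode_jpeg(tf.io.read_file('billboard.jpg'), channels=3)
outputs = detector(tf.expand_dims(image, axis=0))

# Outputs include normalized boxes, class ids, and confidence scores.
boxes = outputs['detection_boxes'][0]
scores = outputs['detection_scores'][0]
classes = outputs['detection_classes'][0]
print(boxes.shape, scores[:5].numpy())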
Setiawan et al. [23] utilized the scale-invariant feature transform (SIFT) technique to distinguish photos of organic from non-organic waste. Input images were resized to meet the algorithm's requirements, and the SIFT-based approach achieved an accuracy of 89%. Ahmed et al. [24] introduced computer vision systems with intelligent cameras or optical sensors for object detection and tracking. They employed SSD (Single Shot MultiBox Detector) and YOLO (You Only Look Once) models for object detection and localization, complemented by visual-processing intelligent robotic cameras; their system achieved a maximum valid detection rate of 90% to 93%.
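A minimal OpenCV rendition of the SIFT step in [23] follows: descriptors are extracted from a resized grayscale image and matched against one reference image per class, with the class collecting more good matches winning. The matching-based vote and all file names are illustrative assumptions; the reference does not prescribe this exact classifier.

import cv2

# SIFT ships in the main opencv-python package (version >= 4.4).
sift = cv2.SIFT_create()
matcher = cv2.BFMatcher()

def descriptors(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (256, 256))  # normalize input dimensions
    _, desc = sift.detectAndCompute(img, None)
    return desc

def good_matches(query, reference):
    # Lowe's ratio test over 2-nearest-neighbour descriptor matches.
    matches = matcher.knnMatch(query, reference, k=2)
    return sum(1 for m, n in matches if m.distance < 0.75 * n.distance)

query = descriptors('waste_photo.jpg')
organic = good_matches(query, descriptors('organic_reference.jpg'))
inorganic = good_matches(query, descriptors('inorganic_reference.jpg'))
print('organic' if organic > inorganic else 'non-organic')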
The literature reviewed above highlights the extensive research on automated visual pollution identification using deep learning techniques, as summarized in Table 1. However, notable gaps persist in integrating the detection and classification of visual pollution in both the textile industry and urban streets using advanced deep learning techniques. Furthermore, most articles have not explored modern edge devices, such as the Raspberry Pi 4B, for real-time automated identification.
Table 1. A comparative analysis of related works on AI-based visual pollution detection.
A comprehensive analysis of the developed devices is lacking in most existing works. This study, in contrast, presents a comparative analysis of automatic detection methods using the Raspberry Pi 4B single-board computer and advanced computer-vision library functions. The Raspberry Pi 4B runs the lightweight YOLOv5 tiny deep learning model for object identification, yielding efficient detection outcomes and quicker response times than previous studies. The research extensively investigates the proposed automatic device, providing insights into its accuracy, implementation cost, and other aspects relevant to detecting and classifying visual pollution in the textile industry and on urban streets.
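A hedged sketch of such an on-device loop is given below: frames from the Pi's camera are read with OpenCV and passed to a small YOLOv5 checkpoint loaded through torch.hub. The yolov5n weights stand in for the compact model described here; real deployments often export to ONNX or TFLite for speed, a step this sketch omits.

import cv2
import torch

# Small YOLOv5 checkpoint as a stand-in for the compact model
# deployed on the Raspberry Pi 4B.
model = torch.hub.load('ultralytics/yolov5', 'yolov5n', pretrained=True)

cap = cv2.VideoCapture(0)  # default USB/CSI camera on the Pi
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # YOLOv5 expects RGB input; OpenCV captures BGR.
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    annotated = results.render()[0]  # frame with boxes drawn, in RGB
    cv2.imshow('visual pollution', cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()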

This entry is adapted from the peer-reviewed paper https://doi.org/10.3390/sci6010005.
