Object Detection for Small Water Floater

Please note this is an old version of this entry, which may differ significantly from the current revision.

Subjects: Water Resources

Contributor:

Fuxun Chen

Lanxin Zhang

Siyu Kang

Lutong Chen

Honghong Dong

Dan Li

Xiaozhu Wu

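Object detection is one of the most widely used applications in unmanned aerial vehicle (UAV) missions. Because of limited pixel values and interference from background noise, detecting small objects in UAV images remains an ongoing challenge.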

small objects
object detection
improved YOLOv5
water surface floaters
UAV images

1. Introduction

Water is the source of life, and oceans and rivers cover about 71% of the Earth’s area [1]. Rivers constitute a vital component of the global water cycle and serve a crucial function in facilitating the transfer [2]. However, recent global economic expansion and urbanization have caused significant damage to the natural environment [3] and severe water pollution [4]. As water quality continues to deteriorate, real-time monitoring of water environments has become crucial. Monitoring is hampered, however, because the water’s surface is prone to overexposure or dim lighting depending on the intensity of sunlight, which reduces the feature information available for a target and makes recognition difficult.

Object detection is one of the most widely used applications in UAV missions. Because of the shooting angle and flight height of UAVs, small objects account for a larger proportion of UAV images than of general scenes [5]. Small object detection has long been one of the key difficulties in object detection [6], with problems such as uninformative image features and blurred object features. Many scholars have proposed solutions. Xia et al. proposed an automated driving system (ADS) data acquisition and analytics platform for vehicle trajectory extraction, reconstruction, and evaluation. In addition to collecting various sensor data, the platform can use deep learning to detect small targets, such as vehicle trajectories [7]. Liu et al. proposed a model for tassel detection in maize, named YOLOv5-tassel. The authors trained the model on UAV remote sensing images and achieved an mAP of 44.7%, an improvement over FCOS, RetinaNet, and YOLOv5 in detecting small maize tassels [8]. Liu et al. proposed multibranch parallel feature pyramid networks (MPFPN) to detect small objects in UAV images. The MPFPN model applies the SSAM attention module [9] to attenuate the effect of background noise and uses a cascade architecture in the Fast R-CNN stage to achieve more powerful localization [10]. Chen et al. proposed a small object detection network based on a classification-oriented super-resolution generative adversarial network (CSRGAN), a model that adds classification branches and introduces classification losses into a typical SRGAN [11]. The experimental results demonstrate that CSRGAN outperforms VGG16 in classification [12]. In this study, UAV technology and deep learning are applied together to small object detection.
However, the water’s surface is prone to overexposure or dim lighting depending on the intensity of sunlight, which leaves less information about a floater’s features and makes recognition difficult. Deep learning models also struggle with small water surface floaters: such objects lose some of their features during down-sampling, which weakens the extraction of global features and leads to low detection accuracy and poor feature recognition. Therefore, applying the original YOLOv5 directly to object detection in UAV images is not very effective. At the same time, UAVs are recognized as a powerful complement to conventional water environment monitoring and assessment and have gradually been applied to water environment detection. This study therefore proposes an efficient and accurate YOLOv5-based method for detecting small water surface floaters in UAV-captured images. The model can effectively locate water surface floaters and thus assist salvage work.
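The loss of small-object features during down-sampling can be made concrete with a little arithmetic. The sketch below computes how many feature-map cells an object of a given pixel size spans at strides 8, 16, and 32. These are the detection-head strides commonly used in YOLOv5, which is an assumption here because the entry does not state them; the function name and the example object sizes are likewise illustrative.

```python
def feature_map_extent(object_px: float, strides=(8, 16, 32)) -> dict:
    """Return the side length, in feature-map cells, that an object spans at each stride.

    The default strides are an assumption (typical YOLOv5 head strides),
    not values taken from the entry.
    """
    return {stride: object_px / stride for stride in strides}


if __name__ == "__main__":
    # Hypothetical object sizes in pixels, chosen only for illustration.
    for size in (16, 32, 96):
        extents = feature_map_extent(size)
        summary = ", ".join(f"stride {s}: {e:.2f} cells" for s, e in extents.items())
        print(f"{size}px object -> {summary}")
```

Under these assumptions, a floater that is 16 pixels wide covers only half a cell at stride 32, so the deepest feature map has little spatial evidence left for it.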

2. Object Detection

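Computer vision research mainly involves image classification, object detection, object tracking, semantic segmentation, and instance segmentation. Object detection is one of the most fundamental and challenging tasks in computer vision [13]. Exploring efficient real-time object detection models has been a research hotspot in recent years [14]. Traditional object detection methods usually consist of region proposal, feature extraction, feature fusion, and classifier training, all of which require laborious hand-crafting of object features and must be carried out step by step [15]. As a result, traditional methods suffer from high redundancy, time-consuming computation, and low accuracy. With the rapid development of deep learning theory, deep learning has made great progress in computer vision tasks such as image classification and object detection [16]. Since then, object detection has entered a new stage. In the 2012 ImageNet competition, A. Krizhevsky et al. used a convolutional neural network (CNN) [17] to greatly improve image classification results compared with traditional methods. A CNN can produce region proposals at low cost through a region proposal network (RPN) [18], which can significantly improve the efficiency of object detection. In 2014, R. Girshick et al. [19] were the first to use a region-based convolutional neural network for object detection, and the detection results improved greatly. Compared with traditional methods, this end-to-end network is more popular because it removes complicated steps such as data preprocessing and the manual design of object features [20]. Since then, deep learning has developed rapidly in object detection and has been widely applied in practice [21].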

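Mainstream deep learning object detection algorithms fall into two categories. The first is one-stage object detectors, such as the YOLO series, SSD [22], RetinaNet [23], and FCOS [24]. The second is two-stage object detectors, such as R-CNN [25], Fast R-CNN [26], Faster R-CNN [27], and Mask R-CNN [28]. For example, the two-stage detector Faster R-CNN integrates a region proposal network (RPN) on top of Fast R-CNN; it mainly comprises a shared convolutional layer module, the RPN, and the Fast R-CNN detector. However, two-stage detectors generate a large number of predicted boxes, which makes them computationally expensive and slow, so they are not suitable for real-time detection tasks. In 2015, J. Redmon et al. proposed the one-stage object detector You Only Look Once (YOLO) [29], which obtains the class and bounding box of an object directly and greatly increases detection speed. Ma et al. proposed an improved YOLOv3 [30] with stronger noise resistance and better generalization. Deep learning has also been applied successfully to the detection of water surface floaters. Van Lieshout et al. built a monitoring dataset of floating plastic debris from videos collected at five different bridges over waterways in Jakarta, Indonesia; they first used Faster R-CNN to detect regions that may contain floating plastic and then used Inception V2 [32], pre-trained on COCO [31], to determine whether those regions contained plastic [33]. Li et al. acquired ocean images of the South China Sea with a camera mounted on an unmanned vessel and used a fusion model based on YOLOv3 and DenseNet [34] to detect objects on the sea surface [35]. Stofa et al. chose a DenseNet model to detect ships in remote sensing images and, by fine-tuning the hyperparameters, settled on a batch size of 16 and a learning rate of 0.0001. The model used the Adam optimizer [36] with optimized parameter settings [37]. In contrast, YOLOv5 [38] discards the candidate box generation stage and performs classification and regression on objects directly, which increases real-time detection speed, and its model complexity is about 10% lower than that of YOLOv4 [39]. Compared with other object detection models, YOLOv5 has stronger generalization ability and a lighter network structure. Moreover, UAV platforms generally have limited computing resources, so the effect of the model on detection speed must be considered. Compared with two-stage detectors, one-stage detectors are better suited to object detection on UAV platforms. Therefore, YOLOv5 was chosen as the baseline model for detecting small water surface floaters.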

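Small objects usually have low resolution and inconspicuous features, so detecting them accurately is a hot topic in object detection. Many scholars have carried out extensive research on small object detection. Kim et al. inserted a high-resolution processing module (HRPM) and a sigmoid fusion module (SFM), which not only reduced computational complexity but also improved detection accuracy for small objects. They obtained good detection results on UAV reconnaissance images and small vehicles [40]. Wang et al. proposed a bidirectional attention network named BANet to address inaccurate and inefficient detection of small and multiple objects. Compared with YOLOX, the model improved AP by 0.55% to 2.93% on the VOC2012 dataset and by 0.3% to 1.01% on the MSCOCO2017 dataset [41]. Yang et al. proposed QueryDet, which uses a novel query mechanism called cascade sparse query (CSQ) to accelerate inference and computes detection results with sparse queries. The model obtains high-resolution feature maps while avoiding redundant computation in background regions. QueryDet was applied to FCOS and Faster R-CNN and tested on small objects in the COCO and VisDrone datasets, where it achieved better accuracy and inference speed than the original algorithms [42]. Liu et al. addressed the low recognition accuracy for small objects by modifying the ROI Align method, which reduced the quantization error of Faster R-CNN and improved accuracy by 7% over the original model [43].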
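The quantization error mentioned for region-of-interest pooling can be illustrated with a small calculation. The sketch below maps a box edge from image coordinates to feature-map coordinates, once rounded to the nearest integer cell (as quantizing ROI pooling does) and once kept fractional (as ROI Align does), and reports the resulting shift in image pixels. It is not the modified method of Liu et al. [43]; the stride of 16, the function name, and the example coordinate are assumptions made for illustration.

```python
def quantization_offset_px(coord_px: float, stride: int = 16) -> float:
    """Image-space shift caused by snapping a coordinate to an integer feature-map cell.

    stride=16 is an assumed backbone stride, not a value from the entry.
    """
    exact = coord_px / stride             # fractional position, as kept by ROI Align
    snapped = round(exact)                # integer cell, as used by quantizing ROI pooling
    return abs(exact - snapped) * stride  # error mapped back to image pixels


if __name__ == "__main__":
    # Hypothetical box edge at x = 37 px in the input image.
    print(quantization_offset_px(37.0))
```

With these numbers the edge moves by 5 pixels, which is negligible for a large object but a substantial fraction of the width of an object that is only a dozen pixels across.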

3. Object Detection in UAV-Captured Images

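UAVs have been widely used in agriculture, forestry, electric power, atmospheric sounding, surveying and mapping, and other industries. Compared with traditional methods, UAVs [44] offer flexibility and mobility, high efficiency and energy savings, diverse outputs, and low operating costs. Lan et al. combined UAV technology with deep learning and used the lightweight Swin-T YOLOX model to detect diseased plants and abnormal areas in orchards; the improved model raised detection accuracy by 1.9% compared with the original YOLOX [45]. Bajić et al. used an improved YOLOv5 for UAV thermal-imaging-based detection of unexploded remnants of war, reaching an accuracy above 90% for all 11 target objects [46]. Liang et al. proposed a UAV-based low-altitude remote sensing helmet detection system that raised the AP for small objects to 88.7%, significantly improving the network's detection performance on small objects [47]. Wang et al. built an enhanced CSPDarknet (ECSP) and a weighted aggregate feature re-extraction pyramid module (WAFR) on the YOLOX-nano network to address the low recognition accuracy for grazing livestock caused by the small number of pixels they occupy [48]. However, because objects in UAV images occupy few pixels and offer limited features, detection accuracy remains low. Small object detection in UAV images is therefore still a major challenge.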

4. Object Detection for Small Water Floater

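As deep learning is increasingly applied to object detection, some researchers have also applied it to the detection of water surface floaters. This application has contributed greatly to the timely detection and handling of floaters and to better supervision of rivers and lakes. Li et al. collected ocean images of the South China Sea with a camera mounted on an unmanned vessel and obtained a dataset of 4000 images after data augmentation. A fusion model based on YOLOv3 and DenseNet was used to detect ship targets on the sea surface [35]. Zhang et al. used a dataset extracted from three days of video surveillance at a location on the North Canal in Beijing to detect plastic bags and bottles floating on the water. The detection model was Faster R-CNN with VGG16 as the feature extractor, and conv4_3 and conv5_3 of VGG16 were selected for feature fusion to improve detection accuracy for small objects [49]. In 2020, van Lieshout et al. built a monitoring dataset of floating plastic debris from videos collected at five different bridges over waterways in Jakarta, Indonesia, and identified floating plastic through two rounds of detection. Faster R-CNN was used to detect regions that may contain floating plastic objects, and Inception V2 pre-trained on COCO was then used to determine whether those regions contained plastic [33].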

This entry is adapted from the peer-reviewed paper 10.3390/su151410751

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply.