
Kowalczyk, P.; Izydorczyk, J. Bounding Box Based Perception Modules. Encyclopedia. Available online: https://encyclopedia.pub/entry/22276 (accessed on 01 September 2024).
Bounding Box Based Perception Modules

Perception modules use raw data streams obtained from sensors mounted on the car, such as camera, radar, or lidar devices, to recognize and interpret the surroundings. The raw data collected by the sensors must be properly interpreted and processed to be understood by the computer. This type of analysis is carried out by algorithms supported mainly by trained neural networks (detectors). Most perception modules based on computer vision use bounding boxes to mark the recognized objects in each frame of the video stream.

Keywords: quality metrics; image detection; bounding box

1. Perception Module

Perception modules use raw data streams obtained from sensors mounted on the car, such as camera, radar, or lidar devices, to recognize and interpret the surroundings. The raw data collected by the sensors must be properly interpreted and processed to be understood by the computer. This type of analysis is carried out by algorithms supported mainly by trained neural networks (detectors). Most perception modules based on computer vision use bounding boxes to mark the recognized objects in each frame of the video stream. The bounding boxes studied here are rectangles whose sides are parallel to the sides of the frame, so each box can be stored as the four coordinates of two opposite corners. Each perception module in a vehicle is specialized in a particular task. Bounding boxes are used to mark objects such as pedestrians, other vehicles, their lights (separately), road signs, speed bumps, and traffic lights in streams of video data [1][2][3][4][5][6]. Based on this description, the vehicle steering system can make decisions in accordance with pre-programmed protocols and logic called ADAS (Advanced Driver Assistance Systems). It is therefore fundamentally important that this description be adequate, detailed, and reliable, to secure the basis on which the decisions are made.
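Since the boxes are axis-aligned, two opposite corners fully determine each one. A minimal sketch of such a representation (the class and field names are illustrative, not taken from the entry):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BoundingBox:
    # Axis-aligned rectangle stored as two opposite corners,
    # in image coordinates (origin at the top-left of the frame).
    x1: float  # left edge
    y1: float  # top edge
    x2: float  # right edge
    y2: float  # bottom edge

    @property
    def area(self) -> float:
        # Degenerate (inverted or empty) boxes are given zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)
```

Storing corners rather than, say, center and size makes intersection tests simple coordinate comparisons, which matters when millions of frames are evaluated.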

2. Testing and Verification of Car Perception

In the process of developing perception modules, it is important to check the quality of their interpretation of sensor data. Such verification must be carried out regularly whenever essential changes are introduced into their algorithms. The data is gathered during the original collection process, which involves a fleet of test cars with sensors mounted on board and logging machines that save the raw streams (for example, video from a camera mounted at the front of the vehicle). When such data is returned to the laboratory, it can be reused in the resimulation process: the coordinated reconstruction of the time-ordered stream of sensor information, which is then fed into the input of the perception module version currently under test. In this way, perception module results for the recorded scenes are obtained. Those results consist of bounding boxes describing different elements of the surroundings recorded in the frames coming from the camera.
In order to train detectors and verify their effects, it is first necessary to describe exactly what in the collected sensor data should be found and interpreted by the perception modules. The reference against which detector results can be compared is called the ground truth (GT). To create this reference, the raw video data must be labeled manually, i.e., a description of the expected perception-module output is created frame by frame. This is handled by appropriately trained staff who analyze the collected sensor data and label it according to predefined principles; the work is laborious and time-consuming. Based on this additional data, it is finally possible to calculate the quality of the results obtained from the perception module, which in a broad context enables development and the evaluation of effectiveness in real conditions.
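A widely used baseline for comparing a detector box against its GT counterpart on a single frame is intersection over union (IoU). The entry develops its own class-specific measures, so the sketch below is only the standard reference point, with boxes given as `(x1, y1, x2, y2)` tuples:

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Overlap region: max of the lower corners, min of the upper corners.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

IoU ranges from 0 (disjoint) to 1 (identical), so a fixed threshold on it is the simplest possible definition of a correct recognition.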
To reduce the time and hardware complexity of such analysis, a high degree of automation is required; a well-designed algorithm also ensures reproducibility of the evaluation results. Research on the development of smart-vehicle perception is related to human safety and aims to minimize the number of road accidents and the damage they cause. It is therefore necessary to create a methodology that reliably and objectively assesses the quality of prototypes and enables quick and effective problem localization [7]. Requirements such as SOTIF [8] (Safety Of The Intended Functionality) are created so that all car manufacturers, researchers, and lawmakers can use a universal set of requirements, recommendations, and good practices. This document underlines the need to confirm the effectiveness of ADAS in different situations for all functionalities those systems provide.

3. Evaluation Methodology

The huge amount of work and resources devoted to acquiring, storing, and preparing these data motivates the creation of an evaluation methodology that makes the best use of them. To ensure an efficient development, testing, and validation process for perception modules, one must evaluate how well the system output depicts the stream of ground truth. The task therefore requires a methodology for comparing two rectangles, as well as sequences of them, that provides information relevant to the context of a module's specialization. The amount of data to be analyzed demands full automation and repeatability of the process. The methodology must also be clearly decisive, so it is important to design approaches that allow the synthesis of detailed conclusions and exploit the potential of the collected data. The goal is a novel evaluation methodology for perception modules used in automotive vehicles. The designed solutions should help engineers estimate the quality of detectors working on video data from a camera mounted at the front of a moving vehicle. The methodology is well suited to comparing GT and detector output in the form of bounding boxes. It provides tools to assess local quality in separate pictures and to summarize results over a sequence of frames (tracking quality), while respecting the need for a quick response in traffic conditions on the road. It can serve as a precise definition of correct recognition and highlights the fact that various types of objects, although all described by rectangles, require their own special approach to evaluation. This is achieved by focusing the measures on different aspects of the rectangles, which allows specific information about the comparison to be filtered out and assigned an appropriate meaning for the whole analysis.
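One way to focus measures on different aspects of the rectangles is to score each edge separately, so a class-specific evaluation can weight the aspects that matter. The sketch below is an illustrative assumption, not the entry's actual measure; the class-specific remarks in the docstring are likewise examples:

```python
def edge_errors(gt, det):
    """Per-edge deviation between a GT box and a detected box,
    normalized by the GT width or height.

    Returning edge-level errors (rather than one overlap score) lets an
    evaluation weight aspects per object class, e.g. the bottom edge for
    pedestrians (road contact point) or horizontal extent for vehicles.
    Boxes are (x1, y1, x2, y2) tuples with x2 > x1 and y2 > y1.
    """
    w = gt[2] - gt[0]  # GT width, normalizes horizontal errors
    h = gt[3] - gt[1]  # GT height, normalizes vertical errors
    return {
        "left":   abs(det[0] - gt[0]) / w,
        "top":    abs(det[1] - gt[1]) / h,
        "right":  abs(det[2] - gt[2]) / w,
        "bottom": abs(det[3] - gt[3]) / h,
    }
```

A class-specific quality score could then be a weighted combination of these four errors, with the weight vector acting as one of the calibration parameters.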
A quality measure can successfully serve as the basis for a matching algorithm, but beyond defining true positives it should be treated as a metric that is directly interpreted and passed to higher levels of the evaluation chain: bounding box sequence analysis. To help with the interpretation of results, ways to visualize the output were proposed, both for the quality of the GT representation and for alerts of false positives, which are a natural consequence of the matching algorithm. The methodology was presented separately for the following example object classes: pedestrians, moving vehicles, traffic lights, and signs. To adapt to different classes of objects and to different applications in the evaluation process (matching, quality summary, combining sequences of false positive results), the methodology has to rely on parameters. The calibration process, i.e., the meaning of all parameters and their influence on the final results, is described. [1][2][3][4][5][6][7][8]
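Using a quality measure as the basis of matching, with a threshold defining true positives, can be sketched as a greedy assignment. This is an illustrative sketch under assumed conventions (the entry's actual matching algorithm and parameters may differ); `quality` is any symmetric-in-role score such as IoU:

```python
def greedy_match(gt_boxes, det_boxes, quality, threshold=0.5):
    """Greedily match detections to ground-truth boxes by a quality measure.

    Pairs are taken in order of decreasing quality; each GT box and each
    detection is used at most once. Matches at or above the threshold are
    true positives (tp), unmatched detections are false positives (fp),
    and unmatched GT boxes are false negatives (fn).
    """
    pairs = sorted(
        ((quality(g, d), gi, di)
         for gi, g in enumerate(gt_boxes)
         for di, d in enumerate(det_boxes)),
        reverse=True,
    )
    used_gt, used_det, tp = set(), set(), []
    for q, gi, di in pairs:
        if q < threshold:
            break  # remaining pairs are sorted, so all are below threshold
        if gi in used_gt or di in used_det:
            continue  # one-to-one matching only
        used_gt.add(gi)
        used_det.add(di)
        tp.append((gi, di, q))
    fp = [di for di in range(len(det_boxes)) if di not in used_det]
    fn = [gi for gi in range(len(gt_boxes)) if gi not in used_gt]
    return tp, fp, fn
```

The retained quality values in `tp` are exactly the per-frame scores that can then be passed upward to sequence-level (tracking) analysis rather than being discarded after matching.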

References

  1. Jiang, S.; Liang, S.; Chen, C.; Zhu, Y.; Li, X. Class Agnostic Image Common Object Detection. IEEE Trans. Image Process. 2019, 28, 2836–2846.
  2. Kim, H.; Kim, C. Locator-Checker-Scaler Object Tracking Using Spatially Ordered and Weighted Patch Descriptor. IEEE Trans. Image Process. 2017, 26, 3817–3830.
  3. Long, Y.; Gong, Y.; Xiao, Z.; Liu, Q. Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 2486–2498.
  4. Ren, W.; Huang, K.; Tao, D.; Tan, T. Weakly Supervised Large Scale Object Localization with Multiple Instance Learning and Bag Splitting. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 405–416.
  5. Zeng, X.; Ouyang, W.; Yan, J.; Li, H.; Xiao, T.; Wang, K.; Liu, Y.; Zhou, Y.; Yang, B.; Wang, Z.; et al. Crafting GBD-Net for Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 2109–2123.
  6. Zhang, X.; Cheng, L.; Li, B.; Hu, H. Too Far to See? Not Really!—Pedestrian Detection With Scale-Aware Localization Policy. IEEE Trans. Image Process. 2018, 27, 3703–3715.
  7. Uřičář, M.; Hurych, D.; Krizek, P.; Yogamani, S. Challenges in Designing Datasets and Validation for Autonomous Driving. arXiv 2019, arXiv:1901.09270.
  8. ISO/PAS 21448:2019; Road Vehicles—Safety of the Intended Functionality. International Organization for Standardization: Geneva, Switzerland, 2019.
Update Date: 27 Apr 2022