The METRIC dataset comprises more than 10,000 synthetic and real images of ChAruCo and checkerboard patterns. Each pattern is securely attached to the robot's end-effector, which is systematically moved in front of four cameras surrounding the manipulator, allowing image acquisition from various viewpoints. The real images encompass multiple sets captured by three distinct types of sensor networks: Microsoft Kinect V2, Intel RealSense Depth D455, and Intel RealSense Lidar L515. These images make it possible to evaluate the advantages and disadvantages of each sensor network for calibration purposes. Additionally, to accurately assess the impact of the camera-robot distance on calibration, a comprehensive synthetic dataset with associated ground truth was acquired. It is divided into three different camera network setups, corresponding to three levels of calibration difficulty based on the cell size.
The use of camera networks has become increasingly popular in various computer vision applications, such as human pose estimation, 3D object detection, and 3D reconstruction. Multi-camera systems offer the advantage of monitoring larger areas and making several computer vision algorithms more robust against occlusions, which frequently occur in complex real-world scenarios such as people-tracking applications or robotic workcells.
Calibrating a camera network is a crucial step in setups involving multiple cameras, and it typically involves determining intrinsic and extrinsic parameters. Intrinsic calibration determines the internal sensor parameters required to accurately project the scene from each 3D camera reference frame onto the corresponding 2D image plane, and these parameters can be obtained using algorithms such as Zhang's or Sturm's. Extrinsic calibration establishes a single 3D reference frame shared by all sensors in the camera network, which is essential for multi-camera applications since it allows the accurate localization of objects or people in the scene with respect to this common reference system. Both intrinsic and extrinsic calibration involve an image acquisition phase in which a calibration pattern is placed at different positions and orientations in front of the sensors. By detecting the pattern control points, an optimization process estimates the camera parameters, for example by minimizing the reprojection error. The calibration pattern can be either a planar model, such as a checkerboard or a ChAruCo pattern, or any other object whose shape is known and which presents easily recognizable features. Intrinsic calibration requires pattern images taken at short distances from the sensor in order to cover the entire image plane, whereas extrinsic calibration is often performed with the pattern at longer distances to ensure, for example, its simultaneous detection by multiple cameras. Hence, the two calibration processes are typically performed in two separate steps. Furthermore, intrinsic calibration is conducted separately for each sensor, and intrinsic parameters such as focal length and image center are sometimes provided by the sensor manufacturer; therefore, cameras are often considered to be intrinsically calibrated.
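Since most of the methods discussed below estimate camera parameters by minimizing the reprojection error, a minimal sketch may help fix ideas. The following pure-Python example (an illustration only, not the pipeline of any cited method; the intrinsics fx, fy, cx, cy are arbitrary values) projects 3D control points with a distortion-free pinhole model and computes the RMS reprojection error against detected 2D points:

```python
import math

def project(point_3d, fx, fy, cx, cy):
    """Project a 3D point expressed in the camera frame onto the image
    plane using the pinhole model (lens distortion omitted for brevity)."""
    X, Y, Z = point_3d
    return (fx * X / Z + cx, fy * Y / Z + cy)

def reprojection_error(points_3d, points_2d, fx, fy, cx, cy):
    """Root-mean-square distance between detected control points and the
    reprojection of their known 3D positions."""
    total = 0.0
    for p3, p2 in zip(points_3d, points_2d):
        u, v = project(p3, fx, fy, cx, cy)
        total += (u - p2[0]) ** 2 + (v - p2[1]) ** 2
    return math.sqrt(total / len(points_3d))

# Perfectly consistent data yields zero error; a real optimizer would
# adjust the parameters to drive this value down over many views.
pts3 = [(0.1, -0.05, 1.0), (0.0, 0.0, 2.0)]
pts2 = [project(p, 600.0, 600.0, 320.0, 240.0) for p in pts3]
print(reprojection_error(pts3, pts2, 600.0, 600.0, 320.0, 240.0))  # -> 0.0
```

In practice this residual is minimized jointly over the intrinsics, distortion coefficients, and pattern poses for all acquired images.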
Camera network calibration is required for several applications, including multi-camera navigation systems, people-tracking within camera networks, and surveillance systems. It is also critical in robotic scenarios, especially when dealing with a robotic workcell composed of a robot arm surrounded by a camera network installed to monitor the workcell area. In such cases, it is essential to provide the robot with accurate information about its working environment; simply estimating the relative positions among the cameras is not enough. A single reference frame shared by all viewpoints is required, which may be an external world reference frame or, more commonly, may coincide with the robot's base. Defining the reference frame at the robot's base allows the robot to locate an object of interest with respect to itself for several tasks, such as industrial and medical applications.
2. METRIC—Multi-Eye to Robot Indoor Calibration Dataset
Several methods have been proposed in the literature targeting camera network calibration and robot-world hand-eye calibration, and in both use cases a planar calibration pattern such as ChAruCo or a checkerboard is typically used. However, these algorithms have typically been evaluated on dedicated setups and specific datasets, limiting the comparison among different methods. Currently, there is a lack of general datasets that can be used to evaluate the performance of calibration algorithms and test their robustness under different conditions, such as variations in the distance between the camera and the calibration pattern.
2.1. Camera Network Calibration
Some of the camera network calibration techniques that share a common approach by using planar calibration patterns are described in more detail below.
Kim et al. 
proposed an extrinsic calibration process for multi-camera systems composed of lidar-camera combinations for navigation systems. The proposed method used a planar checkerboard pattern, which was manually moved in front of the sensors during the calibration process. Furgale et al.
proposed Kalibr, a novel framework that employs maximum-likelihood estimation to jointly calibrate temporal offsets and geometric transformations of multiple sensors. The robustness of the approach is demonstrated through an extensive set of experiments, including the calibration of a camera and an inertial measurement unit (IMU). Tabb et al. 
proposed a method for calibrating asynchronous camera networks. The method addresses the calibration of multi-camera systems without relying on hardware-level synchronization among cameras, which is typically a main factor strongly influencing camera network calibration results. Caron et al.
introduced an algorithm for the simultaneous intrinsic and extrinsic calibration of a multi-camera system using a different model for each camera. The algorithm minimizes the corner reprojection error computed on each camera using the corresponding projection model, exploiting a set of images of a calibration pattern, such as a checkerboard, manually moved to different distances and positions in front of the cameras. Munaro et al. presented OpenPTrack, an open-source multi-camera calibration software designed for people-tracking in RGB-D camera networks
. They proposed a camera network calibration system that works on images acquired from all the sensors while a checkerboard is manually moved within the tracking space so that more than one camera can detect it, followed by a global optimization of the camera and checkerboard poses. All of the above methods address the calibration of specific camera network configurations using a checkerboard or a ChAruCo pattern, but each has been tested on its own dataset for a specific task, which limits the comparison of different techniques on a common, general dataset.
2.2. Robot-World Hand-Eye Calibration
Several works in the literature have addressed the issue of robot-world hand-eye calibration, adopting planar calibration patterns. In a previous study, researchers proposed a non-linear optimization algorithm to solve the robot-world hand-eye calibration problem with a single camera. The proposed method minimizes the corner reprojection error of a checkerboard rigidly attached to the robot's end-effector and moved in front of the sensor at different positions and orientations during image acquisition. In a scenario where a robot is surrounded by a camera network of N sensors, this method must be applied N times, once for each sensor, to determine the pose of each camera with respect to the robot and thus to calculate the relative pose between the different cameras. Tabb et al.
proposed a robot-world hand-multiple-eye calibration procedure using a classic checkerboard and compared two main techniques, each based on a different cost function. The first cost function minimizes the difference of two transformation chains over the n positions of the robot arm reached during image acquisition, and it is based on the Perspective-n-Point (PnP) problem of estimating the rototranslation between a camera and the calibration pattern.
. All of these works focus on calibration within small-sized workcells, where the cameras are placed approximately 1 m from the robot, which limits the ability to analyze the robustness of different calibration methods—particularly as the distance between the cameras and the calibration pattern increases.
2.3. Calibration Dataset
Based on the previous analysis, it can be observed that most of the calibration methods have been developed for specific use cases, such as the calibration of a camera network or the calibration of one or more sensors with respect to a robot, which makes it challenging to evaluate the performance of different calibration methods on standardized benchmarks. In particular, two main limitations have been identified: (i) the lack of common datasets to compare different calibration methods, and (ii) calibration works mainly focused on small workcells and small camera networks.
Tabb et al. 
released a dataset and associated code for calibrating asynchronous camera networks. The dataset includes synthetic and real data aimed at calibrating a network of 12 Logitech C920 HD Pro webcams rigidly attached to the walls of a room and facing the center of the scene. In addition, the authors captured a separate dataset specifically designed to calibrate a network of four moving cameras. In all three datasets, ChAruCo patterns were employed to calibrate a sensor network positioned approximately 0.70 m from the calibration pattern. In
, Wang and Jang presented a dataset used to calibrate a camera network. The dataset was obtained by manually moving a classical checkerboard in front of a multi-camera system consisting of four sensors spaced 0.5 m apart and positioned approximately 1 m from the calibration pattern. Their proposed method generalizes the hand-eye calibration problem, jointly solving multi-eye-to-base problems in closed form to determine the geometric transformations between the sensors of the camera network. Hüser et al.
introduced a real-world dataset that included different recordings of a calibration checkerboard manually moved in front of sensors. The dataset was created to perform the intrinsic and extrinsic calibration of twelve synchronized cameras mounted on the walls of a small room, which were used to record and analyze the grasping actions of a monkey interacting with fruit and other objects. Another dataset, described in detail in 
, consists of a small number of Aruco calibration pattern images positioned about 0.5 m from the camera and used for object localization tasks. As far as datasets for testing robot-world hand-eye calibration methods are concerned, there are only a few available in the literature. One such dataset is published in 
, where the authors propose a set of images of a planar calibration pattern positioned approximately 1 m away from the robot. The pattern consists of a grid of circles and is used for the hand-eye calibration of a manipulator equipped with a monocular camera (PointGrey Flea3) attached to the end-effector. Tabb et al. presented a dataset containing both synthetic and real data that can be used to assess hand-eye calibration techniques. The authors captured several images of a checkerboard by moving a robot arm, equipped with a multi-camera system attached to its end-effector, to various positions; the calibration pattern was positioned approximately 1 m away from the sensors during image acquisition.
The main drawback of many of these datasets is the limited number of images available to test different calibration methods, usually not exceeding 100 images. Additionally, each dataset contains images of a specific calibration pattern that may not be usable by other state-of-the-art methods due to the lack of suitable detectors, further limiting its applicability for evaluating the performance of other techniques.