Vision-Based Pose Estimation of Non-Cooperative Target: History

In the realm of non-cooperative space security and on-orbit servicing, a significant challenge is accurately determining the pose of abandoned satellites using imaging sensors. Traditional methods for estimating the pose of such targets suffer from stray light interference in space, leading to inaccurate results.

  • non-cooperative targets
  • stray light interference
  • vision-based pose estimation

1. Introduction

As human exploration and development of outer space advance, countries demand higher levels of space technology [1]. Key challenges in the aerospace field include spacecraft rendezvous and docking, on-orbit capture and repair of malfunctioning satellites, and space debris removal [2]. These tasks require the ability to perform rendezvous, docking, and capture of non-cooperative targets [3], which in turn depends on relative pose measurement of those targets. This is difficult to achieve because of the poor quality of space imagery: space images often have low contrast and sparse texture and are degraded by stray light. Non-cooperative targets carry no artificial markers or fiducial features for auxiliary measurement, making it hard to obtain geometric, grayscale, depth, and other information about the target surface [4]. These factors also limit the availability of samples, posing further problems and challenges for attitude measurement.
Various methods exist for measuring the pose of non-cooperative targets, depending on the sensors used: visual measurement, scanning lidar measurement, non-scanning three-dimensional laser imaging [5], multi-sensor fusion [6], and so on. Visual measurement uses a camera to obtain images of the target. This approach is structurally simple, requiring only a camera and a computer rather than complex hardware, though it demands substantial computing power. Binocular vision can recover the target's distance and true size through triangulation, making it well suited to the pose measurement of space non-cooperative targets [7]. However, this approach requires a pose estimation algorithm capable of detecting and processing image feature information. Moreover, optical images are especially vulnerable to stray light, which hampers the recognition and detection of space targets and indirectly contributes to the scarcity of dataset samples.
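The triangulation principle behind binocular depth recovery can be sketched as follows; the focal length, baseline, and disparity values are illustrative, not taken from any cited system.

```python
import numpy as np

def stereo_depth(f_px: float, baseline_m: float, disparity_px: np.ndarray) -> np.ndarray:
    """Depth from a rectified stereo pair: Z = f * B / d.

    f_px         -- focal length in pixels
    baseline_m   -- distance between the two camera centres in metres
    disparity_px -- per-pixel disparity (x_left - x_right) in pixels
    """
    d = np.asarray(disparity_px, dtype=float)
    z = np.full_like(d, np.inf)                 # zero disparity -> point at infinity
    np.divide(f_px * baseline_m, d, out=z, where=d > 0)
    return z

# Example: 20 px of disparity, 800 px focal length, 0.5 m baseline
z = stereo_depth(800.0, 0.5, np.array([20.0]))
# Z = 800 * 0.5 / 20 = 20 m
```

The same relation explains why distant targets are hard to range with a short baseline: depth error grows as disparity shrinks toward zero.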
Currently, deep learning methods have spread to many fields beyond image recognition, and the Transformer is a rising star in non-cooperative target detection and recognition. Since its introduction from natural language processing into computer vision, the Transformer has broken free of the limited receptive field of CNNs. It has gained significant attention for its advantages: it requires no proposals as in Faster R-CNN, no anchors as in YOLO, and no centers or post-processing steps such as NMS as in CenterNet, instead predicting detection boxes and classes directly. In such detectors, the Backbone is a feature extraction network that pulls relevant information from images for subsequent stages; the Neck fuses and enhances the Backbone's features before passing them to the Head; and the Head uses those features to predict object positions and classes [8]. DETR brought Transformers into object detection, opening up new research avenues [9]. YOLOS is a series of ViT-based object detection models with minimal modifications and inductive biases [10]. DETR also has various related variants: to address its slow convergence, researchers proposed Deformable DETR, TSP-FCOS, and TSP-RCNN [11][12]. Deformable DETR uses deformable convolution to mitigate slow convergence and low detection accuracy for small objects in sparse spatial positioning. ATC primarily alleviates redundancy in DETR's attention maps and the feature redundancy that grows as the encoder deepens. The Transformer-based Neck thus has mature research solutions that can significantly enhance accuracy. Furthermore, in the context of non-cooperative targets, appropriate modifications can prevent the loss of information when reading patch information, retaining more feature information given the scarcity of information sources.
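To make the "reading patch information" step concrete, the sketch below shows how a ViT-style Backbone splits an image into non-overlapping patches and flattens each into a token; the patch size and image dimensions are arbitrary illustrative choices, not those of any cited model.

```python
import numpy as np

def image_to_patches(img: np.ndarray, patch: int) -> np.ndarray:
    """Split an H x W x C image into non-overlapping patch tokens.

    Returns an (N, patch*patch*C) array with one flattened patch per row,
    the form fed to a Transformer encoder after a linear projection.
    """
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    # reshape to (h/p, p, w/p, p, c), then group the two patch-grid axes first
    x = img.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)              # (h/p, w/p, p, p, c)
    return x.reshape(-1, patch * patch * c)     # (N, p*p*c) tokens

tokens = image_to_patches(np.zeros((224, 224, 3)), patch=16)
# 224/16 = 14 per side, so 14 * 14 = 196 tokens of length 16*16*3 = 768
```

Any modification that reads patches with overlap or finer strides trades more tokens (and compute) for less information loss at patch boundaries, which is the trade-off alluded to above.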

2. Traditional and Deep Learning Methods

To acquire target model information in noisy environments, some traditional research methods transform pose estimation into template matching, using essential matrices for pose initialization. Pose calculation involves image filtering, edge detection, line extraction, and stereo matching. A three-dimensional model of non-cooperative micro- and nanosatellites is reconstructed using a stereo vision system [13]; a feature-matching method then estimates the target's relative pose, followed by ground experiments to assess the algorithm's accuracy. Segal S et al. [14] apply the principles of binocular vision measurement and use an Extended Kalman Filter to track and observe target feature points, achieving pose measurement for non-cooperative spacecraft, and finally construct a trial system for estimating non-cooperative target poses. However, non-cooperative images often vary in quality, and traditional methods lose significant accuracy on blurry or smooth-edged targets, making them inadequate for complex non-cooperative target measurement. Although algorithms based on horizontal and vertical feature lines have been proposed to derive fundamental matrices without paired-point information, their reliance on high-quality imagery conflicts with the scarcity of suitable non-cooperative target image datasets. As a result, these methods face significant limitations in practice.
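The filtering and edge-detection stages of this classical pipeline can be sketched with a plain Sobel operator; this is a generic textbook illustration, not the specific algorithm of any cited work.

```python
import numpy as np

def sobel_edges(img: np.ndarray, thresh: float) -> np.ndarray:
    """Binary edge map via Sobel gradient magnitude (valid region only)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):                      # naive 2-D correlation
        for j in range(w - 2):
            win = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(win * kx)
            gy[i, j] = np.sum(win * ky)
    mag = np.hypot(gx, gy)                      # gradient magnitude
    return mag > thresh                         # threshold -> edge mask

# A sharp vertical step edge is detected; flat regions are not
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img, thresh=1.0)
```

The weakness noted above falls directly out of this sketch: a blurred or smooth edge spreads the gradient over many pixels, so the magnitude at any one pixel may fall below the threshold and the line extraction that follows has nothing to work with.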
Deep learning methods do not depend on a target model, need no manual feature design, and generalize well when training data are adequate. Li K et al. [15] proposed a method that outperforms heatmap- and regression-based methods and improves uncertainty prediction. Zhu Z et al. [16] suggested an algorithm that effectively suppresses interference points and enhances the accuracy of non-cooperative target pose estimation. Despond F T [17] used a novel convolutional model to estimate the relative x and y position and the attitude of the target spacecraft. Compared with traditional methods, deep learning is more versatile and robust across targets and scenarios and can be applied more effectively to non-cooperative pose estimation.

3. Small-Sample Training

To address the challenge of pose estimation for non-cooperative space targets with limited real samples, researchers have turned to deep learning and conducted a series of studies. As the most mature image-processing models, neural networks have been widely employed in non-cooperative target pose estimation, forming the basis for numerous improved and optimized algorithms for various scenarios. Pasqualetto Cassinis L et al. [18] present a fusion of convolutional neural network-based feature extraction and the CEPPnP (covariant efficient Procrustes perspective-n-point) method, combined with Extended Kalman Filtering, for non-cooperative target pose estimation. Hou X et al. [19] introduce a hybrid artificial neural network estimation algorithm based on dual quaternion vectors. Ma C et al. [20] propose a Neural Network-Enhanced Kalman Filter (NNEKF), innovatively improving filter performance using virtual observations of inertial characteristics. Huan W et al. [21] employ existing object detection networks and keypoint regression networks to predict 2D keypoint coordinates, reconstruct a 3D model through multi-viewpoint triangulation, and minimize the 3D coordinates with nonlinear least squares to predict position and orientation. Li Xiang et al. [22] designed a non-cooperative target pose estimation network based on the Google Inception Net model.
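Several of the methods above wrap a neural measurement inside a Kalman filter. A minimal scalar predict/update cycle, shown below, illustrates the filtering idea only; the real systems cited use multidimensional extended filters, and all constants here are illustrative.

```python
import numpy as np

def kf_step(x, P, z, F=1.0, H=1.0, Q=1e-4, R=1e-2):
    """One predict/update cycle of a scalar Kalman filter.

    x, P -- state estimate and its variance
    z    -- new (noisy) measurement
    F, H -- state-transition and observation models
    Q, R -- process and measurement noise variances
    """
    x_pred = F * x                              # predict the state forward
    P_pred = F * P * F + Q
    K = P_pred * H / (H * P_pred * H + R)       # Kalman gain
    x_new = x_pred + K * (z - H * x_pred)       # correct with the innovation
    P_new = (1 - K * H) * P_pred
    return x_new, P_new

# Smoothing a constant attitude angle observed with sensor noise
rng = np.random.default_rng(0)
x, P = 0.0, 1.0
for z in 1.0 + 0.1 * rng.standard_normal(50):
    x, P = kf_step(x, P, z)
```

The filter converges toward the true value while shrinking its variance, which is why a noisy per-frame network output benefits from this kind of temporal fusion.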
Zhou Y et al. [23] applied their MEGNN-based method to the PHM 2010 milling tool condition monitoring (TCM) dataset, and experiments demonstrate that it outperforms three deep learning methods (CNN, AlexNet, ResNet) under small samples. Pan T et al. [24] surveyed generative adversarial networks (GANs), which are considered a promising way to solve small-sample problems. Ma et al. [25] proposed a face recognition method based on sparse representation of deep learning features: a deep CNN first extracts face features, which are then classified by sparse representation. Experiments show that this method achieves higher recognition accuracy, improving by 6–60% over traditional methods; copes effectively with intra-class variation such as lighting, pose, expression, and occlusion; and holds a greater advantage on small-sample problems. Despite the application of deep learning to space target scenarios, its efficacy is still hampered by the scarcity of real samples, often forcing training on simulation datasets and leaving room for improvement in accuracy and methodology.
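The classify-by-reconstruction idea behind sparse-representation methods can be sketched with a simplified least-squares stand-in: each class keeps a dictionary of training features, and a query is assigned to the class whose dictionary reconstructs it with the smallest residual. The l1-minimization of the actual method in [25] is replaced here by ordinary least squares, and all data are toy values.

```python
import numpy as np

def residual_classify(dicts, y):
    """Assign y to the class whose dictionary reconstructs it best.

    dicts -- list of (d, n_i) matrices, one column dictionary per class
    y     -- (d,) query feature vector
    Least-squares stand-in for sparse-representation classification.
    """
    residuals = []
    for D in dicts:
        coef, *_ = np.linalg.lstsq(D, y, rcond=None)
        residuals.append(np.linalg.norm(y - D @ coef))
    return int(np.argmin(residuals))

# Two toy classes living near different axes of a 3-D feature space
class0 = np.array([[1.0, 0.9], [0.0, 0.1], [0.0, 0.0]])
class1 = np.array([[0.0, 0.0], [1.0, 0.9], [0.0, 0.1]])
label = residual_classify([class0, class1], np.array([0.95, 0.05, 0.0]))
# query lies in class 0's subspace
```

Because each class needs only enough training columns to span its subspace, this scheme degrades gracefully when samples are scarce, which is the small-sample advantage the text describes.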

4. Stray Light

During the collection of space signals with optical sensors, non-target light is captured as stray light, and such interference is difficult to suppress or eliminate completely; correlation methods can only reduce its impact [26]. For complex space environments, many studies have therefore incorporated methods for handling this distinctive spatial noise. Yang Ming et al. [27] address the significant effects of lighting and the Earth background on non-cooperative spacecraft attitude measurement, proposing an end-to-end attitude estimation method based on convolutional neural networks with AlexNet and ResNet architectures. Compared with regression alone, this approach effectively reduces the mean absolute error, standard deviation, and maximum error of attitude estimation, and the synthetic images used for training account for in-orbit noise and lighting. Additionally, Sharma S et al. [28] introduce the SPN (spacecraft pose network) model, trained on grayscale images. The SPN model has three branches: the first uses a detector to find the target's bounding boxes in the input image, and the other two use regions within those 2D boxes to determine the relative pose. These accuracy improvements highlight another issue: the scarcity of space target data. To address it, a target dataset is built with the Unity3D 2019 software [29]. To fully simulate the space lighting environment, the brightness of the simulated sunlight is set randomly, starry background noise is added randomly, and data normalization is performed for data enhancement. Jiang Zhaoyang et al. [30] designed a dual-channel neural network based on VGG and DenseNet architectures to locate the pixels corresponding to feature points in the image and output their pixel coordinates, proposing a network pruning method to achieve lightweighting. Addressing both space lighting interference and small samples, Sharma S et al. [31] present a monocular image-based pose estimation network. Phisannupawong T et al. and Chen B et al. [32][33] achieve 6-DOF pose estimation for non-cooperative spacecraft using pre-trained deep models. Although Sonawani S et al. [34] were the first to create a non-cooperative target dataset using a semi-physical simulation platform, overall there has been little research into algorithms that handle stray light and small sample sizes simultaneously.
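The augmentation steps described above (random sunlight brightness, starry background noise, normalization) can be sketched as follows; the scaling range and star density are illustrative choices, not the parameters of the cited dataset.

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Simulate orbital lighting on a grayscale image with values in [0, 1].

    1. random global brightness (a stand-in for varying sunlight),
    2. sparse saturated pixels as a starry background,
    3. zero-mean / unit-variance normalization.
    """
    out = img * rng.uniform(0.5, 1.5)                 # 1. brightness jitter
    stars = rng.random(img.shape) < 0.001             # 2. ~0.1% star pixels
    out = np.where(stars, 1.0, out)
    out = np.clip(out, 0.0, 1.0)
    return (out - out.mean()) / (out.std() + 1e-8)    # 3. normalize

aug = augment(np.full((128, 128), 0.4), np.random.default_rng(1))
```

Randomizing lighting per sample in this way is a cheap form of domain randomization: a network trained on many such variants becomes less sensitive to the particular illumination it will meet in orbit.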

This entry is adapted from the peer-reviewed paper 10.3390/aerospace10120997

References

  1. Bao, W. Research status and development trend of aerospace vehicle control technology. Acta Autom. Sin. 2013, 39, 697–702.
  2. Liang, B.; Du, X.; Li, C.; Xu, W. Advances in space robot on-orbit servicing for non-cooperative spacecraft. Jiqiren (Robot) 2012, 34, 242–256.
  3. Li, R.; Wang, S.; Long, Z.; Gu, D.U. Monocular visual odometry through unsupervised deep learning. arXiv 2017, arXiv:1709.06841.
  4. Hao, G.; Du, X. Research status of optical measurement of space non-cooperative target pose. Prog. Laser Optoelectron. 2013, 50, 246–254.
  5. Yu, L.; Feng, C.; Ying, W. Spacecraft relative pose measurement technology based on lidar. Infrared Laser Eng. 2016, 45, 0817003.
  6. Feng, C.; Wu, H.; Chen, B. Pose parameter estimation between spacecraft based on multi-sensor fusion. Infrared Laser Eng. 2015, 44, 1616–1622.
  7. Kendall, A.; Grimes, M.; Cipolla, R. Convolutional networks for real-time 6-DOF camera relocalization. arXiv 2015, arXiv:1505.07427.
  8. Zhu, X.; Jiang, Q. Research on UAV image target detection based on CNN and Transformer. J. Wuhan Univ. Technol. 2022, 44, 323–331.
  9. Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In European Conference on Computer Vision; Springer International Publishing: Cham, Switzerland, 2020; pp. 213–229.
  10. Fang, Y.; Liao, B.; Wang, X.; Fang, J.; Qi, J.; Wu, R.; Niu, J.; Liu, W. You only look at one sequence: Rethinking transformer in vision through object detection. Adv. Neural Inf. Process. Syst. 2021, 34, 26183–26197.
  11. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable detr: Deformable transformers for end-to-end object detection. arXiv 2020, arXiv:2010.04159.
  12. Sun, Z.; Cao, S.; Yang, Y.; Kitani, K.M. Rethinking transformer-based set prediction for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 3611–3620.
  13. Terui, F.; Kamimura, H.; Nishida, S. Motion estimation to a failed satellite on orbit using stereo vision and 3D model matching. In Proceedings of the 2006 9th International Conference on Control, Automation, Robotics and Vision, Singapore, 5–8 December 2006; pp. 1–8.
  14. Segal, S.; Carmi, A.; Gurfil, P. Vision-based relative state estimation of non-cooperative spacecraft under modeling uncertainty. In Proceedings of the 2011 Aerospace Conference, Big Sky, MT, USA, 5–12 March 2011; pp. 1–8.
  15. Li, K.; Zhang, H.; Hu, C. Learning-Based Pose Estimation of Non-Cooperative Spacecrafts with Uncertainty Prediction. Aerospace 2022, 9, 592.
  16. Zhu, Z.; Xiang, W.; Huo, J.; Yang, M.; Zhang, G.; Wei, L. Non-cooperative target pose estimation based on improved iterative closest point algorithm. J. Syst. Eng. Electron. 2022, 33, 1–10.
  17. Despond, F.T. Non-Cooperative Spacecraft Pose Estimation Using Convolutional Neural Networks. Ph.D. Thesis, Carleton University, Ottawa, ON, Canada, 2022.
  18. Pasqualetto Cassinis, L.; Fonod, R.; Gill, E.; Ahrns, I.; Gil Fernandez, J. Cnn-based pose estimation system for close-proximity operations around uncooperative spacecraft. In Proceedings of the AIAA Scitech 2020 Forum, Orlando, FL, USA, 6–10 January 2020; p. 1457.
  19. Hou, X.; Yuan, J.; Ma, C.; Sun, C. Parameter estimations of uncooperative space targets using novel mixed artificial neural network. Neurocomputing 2019, 339, 232–244.
  20. Ma, C.; Zheng, Z.; Chen, J.; Yuan, J. Robust attitude estimation of rotating space debris based on virtual observations of neural network. Int. J. Adapt. Control Signal Process. 2022, 36, 300–314.
  21. Huan, W.; Liu, M.; Hu, Q. Pose estimation for non-cooperative spacecraft based on deep learning. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; pp. 3339–3343.
  22. Li, X. Design of Spatial Non-Cooperative Target Pose Estimation Algorithm Based on Deep Learning. Master’s Thesis, Harbin University of Technology, Harbin, China, 2019.
  23. Zhou, Y.; Zhi, G.; Chen, W.; Qian, Q.; He, D.; Sun, B.; Sun, W. A new tool wear condition monitoring method based on deep learning under small samples. Measurement 2022, 189, 110622.
  24. Pan, T.; Chen, J.; Zhang, T.; Liu, S.; He, S.; Lv, H. Generative adversarial network in mechanical fault diagnosis under small sample: A systematic review on applications and future perspectives. ISA Trans. 2022, 128, 1–10.
  25. Li, Y. Research and application of deep learning in image recognition. In Proceedings of the 2022 IEEE 2nd International Conference on Power, Electronics and Computer Applications (ICPECA), Shenyang, China, 21–23 January 2022; pp. 994–999.
  26. Xu, Z. Research on Stray Light Suppression and Processing Technology of Space-Based Space Target Detection System. Ph.D. Thesis, University of Chinese Academy of Sciences (Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences), Beijing, China, 2021.
  27. Yang, M. Research on Multi-Mode Intelligent Reconstruction Algorithm for Spatial Non-Cooperative Targets Based on Deep Learning. Master’s Thesis, Harbin University of Technology, Harbin, China, 2019.
  28. Sharma, S.; D’Amico, S. Pose estimation for non-cooperative rendezvous using neural networks. arXiv 2019, arXiv:1906.09868.
  29. Unity Technologies. Unity; Unity Technologies: San Francisco, CA, USA, 2019.
  30. Jiang, Z. Non-Cooperative Spacecraft Monocular Vision Pose Measurement Method Based on Deep Learning. Ph.D. Thesis, Harbin Institute of Technology, Harbin, China, 2021.
  31. Sharma, S.; Beierle, C.; D’Amico, S. Pose estimation for non-cooperative spacecraft rendezvous using convolutional neural networks. In Proceedings of the 2018 IEEE Aerospace Conference, Big Sky, MT, USA, 3–10 March 2018; pp. 1–12.
  32. Phisannupawong, T.; Kamsing, P.; Torteeka, P.; Channumsin, S.; Sawangwit, U.; Hematulin, W.; Jarawan, T.; Somjit, T.; Yooyen, S.; Delahaye, D.; et al. Vision-based spacecraft pose estimation via a deep convolutional neural network for noncooperative docking operations. Aerospace 2020, 7, 126.
  33. Chen, B.; Cao, J.; Parra, A.; Chin, T.J. Satellite pose estimation with deep landmark regression and nonlinear pose refinement. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019; pp. 2816–2824.
  34. Sonawani, S.; Alimo, R.; Detry, R.; Jeong, D.; Hess, A.; Amor, H.B. Assistive relative pose estimation for on-orbit assembly using convolutional neural networks. arXiv 2020, arXiv:2001.10673.