Suitable-Matching Areas’ Selection Method Based on Multi-Level Saliency

Scene-matching navigation is one of the essential technologies for precise navigation in satellite-denied environments. Selecting suitable-matching areas is crucial for trajectory planning and for reducing yaw.

  • scene-matching navigation
  • suitable-matching areas’ selection
  • salient feature

1. Introduction

In recent international conflicts, warfare has gradually shifted from informatized to intelligent operations, and unmanned warfare has become the primary development trend. Unmanned combat platforms supported by artificial intelligence are developing rapidly, and UAV-borne weapons and equipment with strong sensing and strike capabilities have become key to changing the battlefield [1]. The high-precision navigation and positioning system is the core of such platforms: it helps combat systems achieve autonomous reconnaissance and precision strikes, and it has become a research hotspot in autonomous aircraft navigation [2].
Navigation and positioning systems fall mainly into two categories: single systems and composite systems [3]. Single systems mainly include INS (inertial navigation system), GNSS (global navigation satellite system), and visual navigation and positioning systems. However, the navigation error of INS accumulates over the course of a voyage, so INS is unsuitable for long-endurance navigation. GNSS relies on satellite navigation instructions and cannot complete navigation calculations independently. In wartime it is a priority target for enemy attack, which results in satellite denial [4]. Satellites are also susceptible to interference and deception, which significantly limits their application scenarios [5,6]. Single navigation and positioning systems therefore struggle to meet the diverse needs of modern warfare. Composite systems combine several navigation technologies and can quickly switch to another mode when one method fails, which ensures the reliability and continuity of navigation and positioning. They have performed well in actual combat tests, making them an essential direction for the development of navigation and positioning.
As a visual navigation and positioning technology, scene-matching navigation [7,8,9] has the advantages of strong autonomy, high terminal guidance accuracy, no accumulation of errors with the voyage, and strong anti-interference ability. It is often combined with INS and GNSS to improve the anti-interference ability and achieve medium and long-range navigation. At the same time, it also helps solve the problem of autonomous navigation under satellite-denied conditions [10].
Scene-matching navigation selects regional features as the source of information [11]. When the aircraft arrives at the pre-planned scene matching area, it captures scene images along the navigation trajectory or adjacent to the target area in real time via the image sensor [12]. Then, it performs matching operations on the real-time and pre-stored reference images. Finally, the relative displacement of the two is used to calculate aircraft position information [9].
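The matching and displacement steps described above can be sketched as follows. This is only a minimal illustration under stated assumptions: it uses brute-force normalized cross-correlation (NCC) as the matching measure, which the entry does not specify, and the function names `match_offset` and `position_offset`, as well as the `expected_pos` argument (the location predicted by, for example, the INS), are hypothetical.

```python
import numpy as np

def match_offset(reference, realtime):
    """Find where the real-time image best fits inside the reference image.

    Assumption: similarity is measured with normalized cross-correlation.
    Returns ((row, col), score) for the best-scoring top-left position.
    """
    ref = np.asarray(reference, dtype=float)
    tpl = np.asarray(realtime, dtype=float)
    th, tw = tpl.shape
    tpl_zero = tpl - tpl.mean()
    tpl_norm = np.sqrt((tpl_zero ** 2).sum())
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(ref.shape[0] - th + 1):
        for c in range(ref.shape[1] - tw + 1):
            win = ref[r:r + th, c:c + tw]
            win_zero = win - win.mean()
            denom = tpl_norm * np.sqrt((win_zero ** 2).sum())
            if denom == 0:
                continue  # flat window or flat template: NCC is undefined
            score = (win_zero * tpl_zero).sum() / denom
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

def position_offset(reference, realtime, expected_pos):
    """Displacement (d_row, d_col) in pixels between the matched position
    and the position where the aircraft was expected to be (hypothetical)."""
    (r, c), score = match_offset(reference, realtime)
    return (r - expected_pos[0], c - expected_pos[1]), score
```

In practice the pixel displacement would still have to be converted to ground coordinates using the image resolution and sensor geometry, which this sketch leaves out.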
Suitable-matching area selection, reference and real-time image matching, matching result decision making, and matching position inversion are the four critical steps of scene-matching navigation. Among them, suitable-matching area selection is the first step in scene-matching navigation. It is the basis for reference image preparation and track planning and is also the key to correcting positioning and reducing yaw [13]. Since the matching performance of the reference image is closely related to the matching probability and accuracy, selecting the suitable-matching area will directly affect the overall performance of the navigation and positioning system.
Research on matching algorithms is now relatively mature, but research on suitable-matching areas’ selection and matching performance evaluation lags behind. Traditional suitable-matching analysis methods extract features from images and establish a mapping model between feature parameters and matching probabilities [14,15,16]. However, the choice of features and threshold parameters affects the analysis results, which makes the model less versatile across different scene categories. In addition, this research is mainly oriented towards military engineering applications, so little relevant public data is available [17]. Although deep learning helps extract complex latent features from images and overcomes the limitations of hand-crafted matching features, it requires a large number of open-source datasets to support it [18].

2. Suitable-Matching Areas’ Selection Method Based on Multi-Level Saliency

Johnson first proposed the concept, theory, and method of suitable-matching areas’ selection in 1972 [21]. The purpose is to select, from a large reference image, multiple sub-areas of specific sizes and good adaptability to serve as navigation reference images for scene matching. Suitable-matching areas generally satisfy the following four criteria:
  • Richness: The richer the image information contained in the selected area, the more conducive it is to the later matching calculation;
  • Stability: Differences in imaging quality will cause changes in scene features, so matching in areas with unstable features is more likely to fail;
  • Uniqueness: If the selected area contains similar targets, it will increase the probability of mismatching;
  • Salience: Areas with significant features usually have noticeable feature differences from surrounding areas, which is helpful for distinguishing the foreground from the background.
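As a rough illustration of how one of these criteria might be quantified, the sketch below scores richness with the Shannon entropy of the gray-level histogram. Using entropy as a richness proxy is an assumption made here for illustration, and `gray_entropy` is a hypothetical helper name; the other three criteria would need their own measures.

```python
import numpy as np

def gray_entropy(patch, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram of a patch.

    Assumption: the patch holds integer gray levels in [0, levels).
    Higher values indicate a more varied gray-level distribution.
    """
    patch = np.asarray(patch)
    hist = np.bincount(patch.ravel().astype(np.int64), minlength=levels)
    p = hist[hist > 0] / patch.size  # probabilities of occupied levels only
    return float(-(p * np.log2(p)).sum())
```

A uniform patch scores 0 bits, and a patch split evenly between two gray levels scores 1 bit, so the measure rises as the gray-level content becomes more varied.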
How to analyze the matching ability of an image area and determine the selection criteria has always been the focus of research on scene-matching navigation technology. There are three main methods for selecting suitable-matching areas: manual selection methods based on experience, hierarchical screening methods based on multiple feature indicators, and pattern classification methods based on machine learning.
Because manually selected suitable-matching areas lack scientific verification and objectivity, many researchers instead screen areas by extracting appropriate image features so that the selected areas meet the requirements. The hierarchical screening method is the most mature technology for selecting suitable-matching areas. It sets different image features as feature indicators and then uses pattern classification or multi-attribute decision-making methods to establish the mapping between scene adaptability and feature indicators. Ref. [22] constructs a metric function by fusing edge density, average edge intensity, and edge direction dispersion to solve the problem of automatic suitable-matching areas’ selection in local textureless target tracking. Ref. [23] proposes a method based on information entropy, which uses improved information entropy, Frieden gray entropy, and normalized average mutual information as feature indicators. To address changes in matching probability caused by noise in real-time images, ref. [24] proposes two improved indicators, phase correlation length and effective contour density, together with a matching suitability analysis method that fuses multiple indexes using evidence theory. In [25], a combined weighting method calculates the combined weight of multiple feature attributes, and suitable SAR (Synthetic Aperture Radar) scene-matching areas are selected based on the comprehensive evaluation value.
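The general idea of hierarchical screening can be sketched as a cascade of threshold tests over candidate windows. The example below is a simplified assumption, not the method of any cited reference: it uses only two indicators (edge density and average edge intensity, computed from finite-difference gradients), and the function names and all threshold values are hypothetical.

```python
import numpy as np

def edge_indicators(patch, edge_thresh=20.0):
    """Return (edge density, average edge intensity) for one patch.

    Assumption: a pixel counts as an edge when its gradient magnitude
    exceeds edge_thresh (an illustrative value).
    """
    gy, gx = np.gradient(np.asarray(patch, dtype=float))
    mag = np.hypot(gx, gy)
    edges = mag > edge_thresh
    density = float(edges.mean())
    intensity = float(mag[edges].mean()) if edges.any() else 0.0
    return density, intensity

def screen_windows(image, size, stride, min_density=0.05, min_intensity=30.0):
    """Keep the top-left corners of windows that pass every screening level."""
    image = np.asarray(image, dtype=float)
    kept = []
    for r in range(0, image.shape[0] - size + 1, stride):
        for c in range(0, image.shape[1] - size + 1, stride):
            density, intensity = edge_indicators(image[r:r + size, c:c + size])
            if density < min_density:      # level 1: enough edge pixels
                continue
            if intensity < min_intensity:  # level 2: edges strong enough
                continue
            kept.append((r, c))
    return kept
```

Fixed thresholds such as `min_density` are exactly what limits this family of methods: values tuned for one scene type may reject every window in another.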
The hierarchical screening method is easy to implement and highly interpretable. However, in constructing the mapping function, only the impact of feature indicators on the matching probability is considered, and the connection between different features is ignored. In addition, the fixed feature indicators and thresholds will cause the algorithm to be unable to cope with different types of scenes, making the algorithm less robust.
In recent years, some work has been performed on suitable-matching areas’ selection models based on machine learning. In [26,27], researchers used different image features to establish mapping relationships and transformed the selection problem into a clustering discrimination problem. They built SVM classification models to automatically distinguish suitable-matching areas from non-suitable-matching areas. In essence, these methods still train a classifier on traditional image feature parameters to predict adaptability. The trained model gives stable selection results for a particular type of image, but it generalizes poorly to other types. On this basis, a matching probability prediction model based on the ResNet deep learning network is designed in [2,28]; it guides the selection of suitable-matching areas by predicting the matching probability of each sub-image. To enhance the robustness of the model, ref. [29] subdivides image data according to typical scenes and treats the selection of suitable-matching areas as a multi-classification problem. Methods based on deep learning make the model more robust and more flexible. However, they require a large number of annotated images. Their outputs are also closely tied to the performance of the matching algorithm used during training, and their overall interpretability is poor. When the amount of data is small, the sample categories are prone to imbalance. Conducting adaptability analysis with deep learning from a small number of samples is therefore a demanding test of the network’s generalization ability.

This entry is adapted from the peer-reviewed paper 10.3390/rs16010161
