This entry provides a comprehensive overview of methods used in image matching. It starts by introducing area-based matching, outlining well-established techniques for determining correspondences. Then, it presents the concept of feature-based image matching, covering feature point detection and description issues, including both handcrafted and learning-based operators. Brief presentations of frequently used detectors and descriptors are included, followed by a presentation of descriptor matching and outlier rejection techniques. Finally, the entry provides a brief overview of relational matching.
The term “image matching” refers to the automatic establishment of correspondences between points, grayscale tones, features, relations, or other entities in overlapping images. It is a critical process for various photogrammetric and computer vision applications, such as the automatic detection of tie points for aerial triangulation, photo-triangulation, or Structure from Motion (SfM) applications [1], relative orientation processes [2], the automatic collection of digital terrain/surface models (DTMs/DSMs) [3], augmented reality applications [4], and many other fields.
Three main categories of image matching can be distinguished, depending on the entity being matched: (i) area-based matching, (ii) feature-based matching, and (iii) relational matching [5]. Area-based matching identifies correspondences based on the similarity of intensity or color values within image regions. Feature-based matching focuses on detecting and describing interest points that remain stable under transformations such as scale, rotation, and illumination changes. Lastly, relational matching utilizes symbolic relationships between image structures, making it well-suited for template-based applications. To provide a clear overview of the main techniques used in image matching, Figure 1 illustrates the three primary approaches, highlighting their fundamental principles and well-established techniques, as well as representative examples addressed in this entry. Conventional methods, which rely on heuristic-driven approaches, are used across all three categories, whereas learning-based methods, which use machine learning techniques, are mainly limited to feature-based matching. Within feature-based matching, two distinct subcategories can be identified: conventional approaches, which rely on handcrafted detectors and descriptors, and learning-based approaches, which train models to detect and describe features and therefore depend on the availability of high-quality training data. Handcrafted operators are manually designed algorithms that detect and describe features using predefined rules, mathematical models, or methods (e.g., corner detection, gradient analysis) to extract distinctive features from images. Learning-based operators use machine learning techniques to detect and describe features, after being trained on large datasets to learn feature representations and correspondences.
Figure 1. Overview of image matching methods, outlining well-established techniques and representative examples.
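As a minimal illustration of the area-based principle described above, the following sketch matches a template against an image using normalized cross-correlation (NCC), one of the classic similarity measures in this category. The function names and the synthetic data are illustrative assumptions, not material from this entry; practical implementations use optimized library routines rather than an explicit double loop.

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equal-sized grayscale patches."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0:
        return 0.0
    return float((a * b).sum() / denom)

def match_template(image, template):
    """Slide `template` over `image`; return the (row, col) with the best NCC score."""
    ih, iw = image.shape
    th, tw = template.shape
    best_score, best_pos = -1.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Synthetic demo: cut a patch out of a random image and recover its location.
rng = np.random.default_rng(0)
image = rng.random((40, 40))
template = image[12:20, 25:33].copy()   # ground-truth position (12, 25)
pos, score = match_template(image, template)
print(pos, round(score, 3))             # → (12, 25), score ≈ 1.0
```

Because NCC subtracts the mean and normalizes by the standard deviation of each patch, the score is invariant to linear brightness and contrast changes, which is why it remains a standard similarity measure for area-based matching.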
This entry focuses on feature-based matching, which is the most commonly used method, and also outlines the basic fundamentals of the other two categories (area-based matching and relational matching). In the context of feature-based matching, basic operators for detecting and describing interest points are presented, including both handcrafted and learning-based operators. In addition, the main methods for eliminating invalid correspondences are presented, since removing such outliers is what makes the final matching result reliable.
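To make the idea of eliminating invalid correspondences concrete, the following sketch applies RANSAC, a representative robust-estimation technique widely used for this purpose, to putative point matches corrupted by gross outliers. For brevity it estimates a simple 2D translation rather than a full homography or fundamental matrix; the function name, thresholds, and synthetic data are illustrative assumptions, not from this entry.

```python
import numpy as np

def ransac_translation(pts_a, pts_b, threshold=2.0, iterations=200, seed=0):
    """Estimate a 2D translation between putative correspondences with RANSAC.

    pts_a, pts_b: (N, 2) arrays of matched point coordinates.
    Returns the refined translation vector and a boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    n = len(pts_a)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(iterations):
        i = rng.integers(n)                  # minimal sample: one correspondence
        t = pts_b[i] - pts_a[i]              # candidate translation
        residuals = np.linalg.norm(pts_a + t - pts_b, axis=1)
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refine the model over the largest consensus set found.
    t = (pts_b[best_inliers] - pts_a[best_inliers]).mean(axis=0)
    return t, best_inliers

# Synthetic demo: 80 correct matches shifted by (5, -3) plus noise,
# and 20 gross outliers simulating wrong descriptor matches.
rng = np.random.default_rng(1)
pts_a = rng.random((100, 2)) * 100
pts_b = pts_a + np.array([5.0, -3.0]) + rng.normal(0, 0.3, (100, 2))
pts_b[80:] += rng.uniform(20, 60, (20, 2))   # corrupt the last 20 matches
t, inliers = ransac_translation(pts_a, pts_b)
print(np.round(t, 1), inliers.sum())
```

The same consensus-based scheme extends to geometrically richer models (similarity, homography, fundamental matrix) by enlarging the minimal sample accordingly; only the correspondences consistent with the best model are kept for subsequent orientation or reconstruction steps.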