From the 1980s onward, remote sensing research was based mainly on satellite data. Because satellite launches are costly, only a few remote sensing satellites were available for a long time, and most satellite imagery was expensive to obtain and limited in quantity; the exceptions were a few satellites, such as the Landsat series, whose data were partially free. This also shaped the direction of remote sensing research. During this period, many remote sensing index methods based on the spectral characteristics of ground targets relied mainly on free Landsat data, while other satellite data were used less because of their high purchase costs.
Besides the high cost and limited supply, remote sensing satellite data acquisition is also constrained by several factors that affect the observation ability and direction of research:
In the past decade, the emergence of multi-rotor unmanned aerial vehicles (UAV) has gradually changed the above-mentioned limitations in remote sensing research. This type of unmanned aircraft is pilotless, consumes no fuel, and does not require maintenance of turboshaft engines. These multi-copters are equipped with cheap but reliable brushless motors, which only require a small amount of electricity per flight. Users can schedule the entire flight process of a multi-copter, from takeoff to landing, and edit flight parameters such as passing points, flight speed, acceleration, and climbing rate. Compared to human-crewed aircraft such as helicopters and small fixed-wing aircraft, multi-rotor drones are more stable and reliable, and have several advantages for remote sensing applications.
2. UAV Platforms and Sensors
The hardware of a UAV remote sensing platform consists of two parts: the flight platform of the drone, and the sensors it is equipped with. Compared to remote sensing satellites, one of the most significant advantages of UAV remote sensing is the flexible replacement of sensors, which allows researchers to use the same drone to study the properties and characteristics of different objects by using different types of sensors. Figure 1 shows this section’s structure, including the drone’s flight platform and the different types of sensors carried.
Figure 1. UAV platforms and sensors.
2.1. UAV Platform
UAVs have been increasingly employed as a remote sensing observation platform for near-ground applications. Multi-rotor, fixed-wing, hybrid UAVs, and unmanned helicopters are the commonly used categories of UAVs. Among these, multi-rotor UAVs have gained the most popularity, owing to their numerous advantages. These UAVs, which come in various configurations, such as four-rotor, six-rotor, and eight-rotor, offer high safety during takeoff and landing and do not require a large airport or runway. They are highly controllable during flight and can easily adjust their flight altitude and speed. Additionally, some multi-rotor UAVs are equipped with obstacle detection abilities, allowing them to stop or bypass obstacles during flight. Figure 2 shows four typical UAV platforms.
Figure 2. UAV platforms: (a) Multi-rotor UAV, (b) Fixed-wing UAV, (c) Unmanned Helicopter, (d) VTOL UAV.
Multi-rotor UAVs use multiple rotating propellers driven by brushless motors to generate and control lift. Each rotor can adjust its rotation speed independently and rapidly, which allows quick recovery of flight altitude and attitude after a disturbance. However, the power efficiency of multi-rotor UAVs is relatively low, and their flight duration is relatively short. Common consumer-grade drones, after careful optimization of weight and power, have a flight time of about 30 min; for example, DJI’s Mavic Pro has a flight time of 27 min, the Mavic 2 has a flight time of 31 min, and the Mavic Air 2 has a flight time of 34 min. Despite these limitations, multi-rotor UAVs have been extensively used as remote sensing data acquisition platforms in the reviewed literature.
Fixed-wing UAVs, which are similar in structure to common aircraft, generate lift from the pressure difference between the upper and lower surfaces of their fixed wings during forward movement. These UAVs require a runway for takeoff and landing, and their landing process is more challenging to control than that of multi-rotor UAVs. Stable flight requires the wings to generate enough lift to support the weight of the aircraft, so the UAV must maintain a certain minimum speed throughout its flight. Consequently, these UAVs cannot hover, and their ability to respond to rising or falling airflow is limited. The flight speed of fixed-wing UAVs is higher than that of multi-rotor UAVs, and their flight duration is also longer.
Unmanned helicopters, which have a structure similar to crewed helicopters, employ a large main rotor to provide lift and a tail rotor to control direction. These UAVs offer excellent power efficiency and flight duration, but their mechanical blade structure is complex, leading to strong vibration and high costs. As a result, only limited research using unmanned helicopters as a remote sensing platform was reported in the reviewed literature.
Hybrid UAVs, also known as vertical take-off and landing (VTOL), combine the features of both multi-rotor and fixed-wing UAVs. These UAVs take off and land in multi-rotor mode and fly in fixed-wing mode, providing the advantages of easy control during takeoff and landing and energy-saving during flight.
2.2. Sensors Carried by UAVs
UAVs have been widely utilized as a platform for remote sensing, and the sensors carried by these aircraft play a critical role in data acquisition. Among the sensors commonly used by multi-rotor UAVs, there are two main categories: imagery sensors and three-dimensional information sensors. In addition to the two types of sensor that are commonly used, other types of sensors carried by drones include gas sensors, air particle sensors, small radars, etc. Figure 3 shows four typical UAV-carried sensors.
Figure 3. Sensors carried by UAVs: (a) RGB Camera, (b) Multi-spectral Camera, (c) Hyper-spectral Camera, (d) LIDAR.
Imagery sensors capture images of the observation targets and can be further classified into several types. RGB cameras capture images in the visible spectrum and are commonly used for vegetation mapping, land use classification, and environmental monitoring. Multi-spectral/hyper-spectral cameras capture images in multiple spectral bands, enabling the identification of specific features such as vegetation species, water quality, and mineral distribution. Thermal imagers capture infrared radiation emitted by the targets, making it possible to identify temperature differences and detect heat anomalies. These sensors can provide high-quality imagery data for various remote sensing applications.
In addition to imagery sensors, multi-rotor UAVs can also carry three-dimensional information sensors. These sensors are relatively new and have been developed in recent years with the advancement of simultaneous localization and mapping (SLAM) technology. LIDAR sensors use laser beams to measure the distance between the UAV and the target, enabling the creation of high-precision three-dimensional maps. Millimeter wave radar sensors use electromagnetic waves to measure the distance and velocity of the targets, making them suitable for applications that require long-range and all-weather sensing. Multi-camera arrays capture images from different angles, allowing the creation of 3D models of the observation targets. These sensors can provide rich spatial information, enabling the analysis of terrain elevation, structure, and volume.
2.2.1. RGB Cameras
RGB cameras are a prevalent remote sensing sensor among UAVs, and two types of RGB cameras are commonly used on UAV platforms. The first type is the UAV-integrated camera, which is mounted on the UAV using its gimbal. This camera typically has a resolution of 20 megapixels or higher, such as the 20-megapixel 4/3-inch image sensor integrated into the DJI Mavic 3 aircraft and the 20-megapixel 1-inch image sensor integrated into AUTEL’s EVO II Pro V3 UAV. These cameras can capture high-resolution images at high frame rates, with the advantages of light weight, compactness, and long endurance. However, their built-in lenses cannot be swapped for the telephoto or wide-angle lenses needed for long-distance or wide-angle observation.
The second type of camera commonly carried by UAVs is a single lens reflex (SLR) camera, which allows lenses of different focal lengths to be fitted. UAVs equipped with SLR cameras therefore offer lens flexibility and can be used for long-distance or wide-angle observation. However, SLR cameras are heavier and require a gimbal for installation, so the UAV must have sufficient size and load capacity to carry them. For example, Liu et al. [42] utilized the SONY A7R camera, which provides multiple lens options, including zoom and fixed focus lenses, to produce a high-precision digital elevation model (DEM) in their research.
2.2.2. Multi-Spectral and Hyper-Spectral Camera
Multi-spectral and hyper-spectral cameras are remote sensing instruments that collect the spectral radiation intensity of reflected sunlight at specific wavelengths. A multi-spectral camera is designed to provide data similar to that of multi-spectral remote sensing satellites, allowing for quantitative observation of the radiation intensity of light reflected by ground targets in specific bands of sunlight. In processing multi-spectral satellite remote sensing image data, the reflected light intensities of the same ground target in different spectral bands are combined into remote sensing indices, such as the widely used normalized difference vegetation index (NDVI) [9], a dimensionless index defined as in Equation (1):
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \tag{1}$$
In Equation (1), NIR refers to the measured intensity of reflected light in the near-infrared spectral range (700∼800 nm), while Red refers to the measured intensity of reflected light in the red spectral range (600∼700 nm). NDVI is used to measure vegetation density, as living green plants, algae, cyanobacteria, and other photosynthetic autotrophs absorb red and blue light but reflect near-infrared light. Thus, vegetation-rich areas have higher NDVI values.
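As a minimal sketch of this definition, the following Python function computes NDVI for one pixel from its near-infrared and red intensities. The function name and the reflectance values are hypothetical and chosen only for illustration; in practice the same arithmetic would be applied to whole image bands.

```python
def ndvi(nir, red):
    """Return (NIR - Red) / (NIR + Red), or None when both bands are zero."""
    total = nir + red
    if total == 0:
        return None  # undefined: no reflected signal in either band
    return (nir - red) / total


# Hypothetical reflectance values (0..1), chosen only for illustration.
samples = {
    "dense canopy": (0.50, 0.08),  # strong NIR reflection, strong red absorption
    "bare soil": (0.30, 0.25),     # similar reflectance in both bands
}

for name, (nir, red) in samples.items():
    print(f"{name}: NDVI = {ndvi(nir, red):.2f}")
```

Because the index is a normalized ratio, it always lies between -1 and 1 for non-negative intensities, which is what makes values comparable across scenes.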
After the launch of the Landsat-1 satellite in 1972, multi-spectral scanner system (MSS) sensors, which can independently observe ground-reflected light in separate frequency ranges, became a popular data source for research. In studying spring vegetation greening and subsequent degradation in the Great Plains of the central United States, the study region spanned a wide range of latitudes, so NDVI [9] was proposed as a spectral index that is insensitive to changes in latitude and solar zenith angle. NDVI ranges from 0.3 to 0.8 in densely vegetated areas and is negative for cloud- and snow-covered areas; for a water body, NDVI is close to 0, and for bare soil, it is a small positive value.
In addition to the vegetation index, other common remote sensing indices include the normalized difference water index (NDWI) [12], enhanced vegetation index (EVI) [11], leaf area index (LAI) [43], modified soil adjusted vegetation index (MSAVI) [13], soil adjusted vegetation index (SAVI) [14], and other remote sensing index methods. These methods measure the spectral radiation intensity of blue light, green light, red light, red edge, near-infrared, and other object reflection bands.
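To illustrate how such indices combine band intensities, the sketch below implements three of them using their commonly published forms. Whether these forms match the exact formulations in the cited references is an assumption: in particular, NDWI is written here in its green/near-infrared form, SAVI uses the conventional soil factor of 0.5, and EVI uses the coefficients commonly applied to MODIS data. Inputs are assumed to be positive reflectances, and all sample values are hypothetical.

```python
def savi(nir, red, soil_factor=0.5):
    """Soil adjusted vegetation index; soil_factor=0.5 is the conventional default."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def ndwi(green, nir):
    """Normalized difference water index (green/NIR form, assumed here)."""
    return (green - nir) / (green + nir)


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly used coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical reflectances for a vegetated pixel and a water pixel.
print(f"SAVI (vegetation): {savi(0.50, 0.08):.3f}")
print(f"EVI  (vegetation): {evi(0.50, 0.08, 0.04):.3f}")
print(f"NDWI (water):      {ndwi(0.10, 0.02):.3f}")
```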
Table 1 presents a comparison between the multi-spectral cameras of UAVs and the multi-spectral sensors of satellites. One notable difference is that a UAV’s multi-spectral camera has a specific narrow band known as the “red edge” [44], which is not present in many satellites’ multi-spectral sensors. This band has a wavelength range of 680 nm to 730 nm, covering the transition from the visible wavelengths readily absorbed by plants to the infrared band largely reflected by plant cells. From a spectral perspective, this band is the region where the reflectance of plants changes significantly. A few satellites, such as the European Space Agency (ESA)’s Sentinel-2, have data available in this band. Research on satellite data has revealed a correlation between leaf area index (LAI) [43] and this band [45,46,47]. LAI [43] is a crucial variable in predicting photosynthetic productivity and evapotranspiration. Another significant difference between UAV multi-spectral cameras and satellite sensors is spatial resolution. UAV multi-spectral cameras can reach centimeter-per-pixel spatial resolution, which is currently unattainable by satellite sensors. Centimeter-resolution multi-spectral images have many applications in precision agriculture.
Table 1. Parameters of UAV multi-spectral cameras and several satellite multi-spectral sensors.
Hyper-spectral and multi-spectral cameras are both imaging devices that can capture data across multiple wavelengths of light. However, there are some key differences between these two types of camera. Multi-spectral cameras typically capture data across a few discrete wavelength bands, while hyper-spectral cameras capture data across many more (often hundreds of) narrow and contiguous wavelength bands. Moreover, multi-spectral cameras generally have a higher spatial resolution than hyper-spectral cameras. Additionally, hyper-spectral cameras are typically more expensive than multi-spectral cameras. Table 2 summarizes several hyper-spectral cameras that were used in the papers reviewed herein, together with their features.
Table 2. Parameters of Hyper-spectral Cameras.
The data produced by hyper-spectral cameras are not only useful for investigating the reflected spectral intensity of green plants but also for analyzing the chemical properties of ground targets. Hyper-spectral data can provide information about the chemical composition and water content of soil [48], as well as the chemical composition of ground minerals [49,50]. This is because hyper-spectral cameras can capture data across many narrow and contiguous wavelength bands, allowing for detailed analysis of the unique spectral signatures of different materials. The chemical composition and water content of soil can be determined based on the unique spectral characteristics of certain chemical compounds or water molecules, while the chemical composition of minerals and artifacts can be identified based on their distinctive spectral features. As such, hyper-spectral cameras are highly versatile tools that can be utilized for a broad range of applications in various fields, including agriculture, geology, and archaeology.
2.2.3. LIDAR
LIDAR, an acronym for “laser imaging, detection, and ranging”, is a remote sensing technology that has become increasingly popular in recent years, due to its ability to generate precise and highly accurate 3D images of the Earth’s surface. LIDAR systems mounted on UAVs are capable of collecting data for a wide range of applications, including surveying [51,52], environmental monitoring [53], and infrastructure inspection [54,55,56].
One of the key advantages of using LIDAR in UAV remote sensing is its ability to provide highly accurate and detailed elevation data. By measuring the time it takes for laser pulses to bounce off the ground and return to the sensor, LIDAR can create a high-resolution digital elevation model (DEM) of the terrain. This data can be used to create detailed 3D maps of the landscape, which are useful for a variety of applications, such as flood modeling, land use planning, and urban design.
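The time-of-flight principle described above can be sketched as follows. The function names are hypothetical, and the elevation helper assumes, for simplicity, a pulse fired straight down from a sensor of known altitude; a real system also has to account for scan angle, platform attitude, and position.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(round_trip_s):
    """One-way distance for a pulse that travels to the target and back."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def ground_elevation(sensor_altitude_m, round_trip_s):
    """Elevation of the reflecting surface, assuming a straight-down pulse."""
    return sensor_altitude_m - range_from_round_trip(round_trip_s)


# Hypothetical example: a 400 ns round trip from a sensor at 160 m altitude.
distance = range_from_round_trip(400e-9)
print(f"range: {distance:.2f} m")
print(f"surface elevation: {ground_elevation(160.0, 400e-9):.2f} m")
```

The division by two reflects that the measured time covers the path to the ground and back; one nanosecond of timing error corresponds to roughly 15 cm of range.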
Another benefit of using LIDAR in UAV remote sensing is its ability to penetrate vegetation cover to some extent, allowing for the creation of detailed 3D models of forests and other vegetation types. Multiple-return LIDAR can record the return times of several reflections produced by the same emitted pulse. By exploiting this feature, information on the canopy structure of a forest can be obtained from the differences between these return times. These data can be used for ecosystem monitoring, wildlife habitat assessment, and other environmental applications.
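A minimal sketch of how differing return times translate into canopy structure is given below. It assumes a straight-down pulse whose first return comes from the canopy top and whose last return comes from the ground; this is a simplifying assumption, since in dense vegetation the last return may not reach the ground. The function name and the timing values are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def canopy_height_m(return_times_s):
    """Height between the first and last return of one straight-down pulse.

    Assumes the first return is the canopy top and the last is the ground.
    """
    if len(return_times_s) < 2:
        return 0.0  # a single return carries no vertical structure
    first, last = min(return_times_s), max(return_times_s)
    return SPEED_OF_LIGHT_M_PER_S * (last - first) / 2.0


# Hypothetical pulse with three returns: canopy top, a branch, the ground.
returns = [400e-9, 420e-9, 500e-9]
print(f"estimated canopy height: {canopy_height_m(returns):.2f} m")
```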
In addition to mapping and environmental monitoring, LIDAR-equipped UAVs are also used for infrastructure inspection and construction environment monitoring. By collecting high-resolution images of bridges, buildings, and other structures, LIDAR can help engineers and construction professionals identify potential problems. Figure 4 shows mechanical scanning and solid-state LIDAR.
Figure 4. LIDAR: (a) Mechanical Scanning LIDAR, (b) Solid-state LIDAR.
LIDAR technology has evolved significantly in recent years with the emergence of solid-state LIDAR, which uses an array of stationary lasers and photodetectors to scan the target area. Solid-state LIDAR offers several advantages over mechanical scanning LIDAR, which uses a rotating mirror or prism to sweep a laser beam across the target area. Solid-state LIDAR is typically more compact and lightweight, making it well suited for use on UAVs.
3. UAV Remote Sensing Data Processing
UAV remote sensing has several advantages compared with satellite remote sensing: (1) A UAV can be equipped with specific sensors for observation, as required. (2) A UAV can observe targets at any time allowed by weather and environmental conditions. (3) A UAV can follow a repeatable flight route, to observe a target multiple times from a set altitude and angle. (4) The image sensor mounted on the UAV is closer to the target, so the resolution of the acquired images is higher. These characteristics have not only enabled the remote sensing community to develop new techniques for land cover/land use and change detection, topics previously studied with satellite data, but have also contributed to the growth of forest remote sensing, precision agriculture remote sensing, and other research directions.
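Advantage (4) can be made concrete with the standard pinhole-camera relation for ground sample distance (GSD): the ground footprint of one pixel equals the flight altitude multiplied by the pixel pitch and divided by the focal length. The sketch below applies it to a straight-down view; the camera parameters are hypothetical values chosen for illustration and do not describe any specific product.

```python
def ground_sample_distance_m(altitude_m, pixel_pitch_m, focal_length_m):
    """Ground footprint of one pixel for a straight-down (nadir) image."""
    return altitude_m * pixel_pitch_m / focal_length_m


# Hypothetical camera: 2.4 micrometre pixel pitch, 8.8 mm focal length.
PIXEL_PITCH_M = 2.4e-6
FOCAL_LENGTH_M = 8.8e-3

for altitude_m in (50.0, 100.0):
    gsd_cm = 100.0 * ground_sample_distance_m(altitude_m, PIXEL_PITCH_M, FOCAL_LENGTH_M)
    print(f"altitude {altitude_m:.0f} m -> GSD {gsd_cm:.2f} cm/pixel")
```

Since GSD scales linearly with altitude, halving the flight height halves the pixel footprint, which is why low-flying UAVs reach centimeter-level resolution.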
3.1. Land Cover/Land Use
Land cover and land use are fundamental topics in satellite remote sensing research. This field aims to extract information about ground observation targets from low-resolution image data captured by early remote sensing satellites. NASA’s Landsat series satellite program is the longest-running Earth resource observation satellite program to date, with 50 years of operation since the launch of Landsat-1 [57] in 1972.
In the early days of remote sensing, land use classification methods focused on identifying and classifying the spectral information of pixels covering the target object, known as sub-pixel approaches [58]. The idea behind these methods is that the spectral signature of a single pixel in a remote sensing image is the spatial average of the spectral signatures reflected from the multiple object surfaces within the area covered by that pixel.
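The spatial-average idea can be sketched as a linear mixing model: a pixel covered by two surface types has a spectrum equal to the area-weighted average of their pure spectra, and the area fraction can be recovered by least squares. This two-surface version is a simplified illustration, not the specific method of the cited work; the function names and the four-band spectra are hypothetical.

```python
def mix(fraction_a, spectrum_a, spectrum_b):
    """Spectrum of a pixel covered by surface A (fraction_a) and surface B (the rest)."""
    return [fraction_a * a + (1.0 - fraction_a) * b
            for a, b in zip(spectrum_a, spectrum_b)]


def unmix_two(pixel, spectrum_a, spectrum_b):
    """Least-squares estimate of the area fraction of surface A, clipped to [0, 1]."""
    diff = [a - b for a, b in zip(spectrum_a, spectrum_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two pure spectra are identical")
    numer = sum((p - b) * d for p, b, d in zip(pixel, spectrum_b, diff))
    return min(1.0, max(0.0, numer / denom))


# Hypothetical pure spectra in four bands (blue, green, red, NIR).
vegetation = [0.05, 0.08, 0.04, 0.50]
soil = [0.15, 0.20, 0.25, 0.30]

pixel = mix(0.3, vegetation, soil)  # 30% vegetation, 70% soil
print(f"estimated vegetation fraction: {unmix_two(pixel, vegetation, soil):.2f}")
```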
However, with the emergence of high-resolution satellites, such as QuickBird and IKONOS, which can capture images with meter-level or decimeter-level spatial resolution, the industry has produced a large amount of high-resolution remote sensing data with sufficient object textural features. This has led to the development of object-based image analysis (OBIA) methods for land use/land cover.
OBIA uses a super-pixel segmentation method to segment the image and then applies a classifier to the spectral features of the segmented blocks to identify the type of ground targets. In recent years, neural network methods, especially the fully convolutional network (FCN) [59], have become the mainstream methods of land use and land cover research. Semantic segmentation [23,60,61] and instance segmentation [24,62,63] neural network methods can extract the type, location, and spatial range information of ground targets end-to-end from remote sensing images.
The emergence of unmanned aerial vehicle (UAV) remote sensing has produced a new generation of data for land cover/land use research. The image sensors carried by UAVs can acquire images with decimeter-level, centimeter-level, or even millimeter-level resolution. As a result, information extraction for small ground objects that were previously difficult to study, such as people on the street, cars, animals, and plants, has become a new research interest.
Researchers have proposed various methods to address these challenges. For instance, PEMCNet [64], an encoder–decoder neural network method proposed by Zhao et al., achieved good classification results for LIDAR data taken by UAVs, with a high accuracy for ground objects such as buildings, shrubs, and trees. Harvey et al. [65] proposed a terrain matching system based on the Xception [66] network model, which uses a pretrained neural network to determine the position of the aircraft without relying on inertial measurement units (IMUs) and global navigation satellite systems (GNSS). Additionally, Zhuang et al. [67] proposed a method based on neural networks to match remote sensing images of the same location taken from different perspectives and resolutions, called multiscale block attention (MSBA). By segmenting and combining the target image and calculating the loss function separately for the local area of the image, the authors realized a matching method for complex building targets photographed from different angles.
3.2. Change Detection
Remote sensing satellites can observe the same target area multiple times. By comparing the images obtained from two observations, we can detect changes in the target area over time. Change detection using remote sensing satellite data has wide-ranging applications, such as urban planning, agricultural surveying, disaster detection and assessment, and map compilation.
UAV remote sensing technology allows for data acquisition from multiple aerial photos taken at different times along a preset route. Compared to other types of remote sensing, UAV remote sensing has advantages in spatial resolution and data acquisition for change detection. Some of the key benefits include: (1) UAV remote sensing operates at a lower altitude, making it less susceptible to meteorological conditions such as clouds and rain. (2) The data obtained through UAV remote sensing are generated through structure-from-motion and multi-view stereo (SfM-MVS) and airborne laser scanning (ALS) methods, which enable the creation of a DEM for the observed target and adjacent areas, allowing us to monitor changes in three dimensions over time. (3) UAVs can acquire data at user-defined time intervals by conducting multiple flights in a short time.
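As a minimal sketch of three-dimensional change monitoring, the code below subtracts two co-registered DEM grids from different flights and converts the elevation differences into gained and lost volume. The function names, grid values, cell size, and the noise threshold are hypothetical; choosing a threshold that reflects the actual DEM error is left to the analyst.

```python
def dem_difference(dem_before, dem_after):
    """Per-cell elevation change between two co-registered DEM grids."""
    return [[after - before for before, after in zip(row_b, row_a)]
            for row_b, row_a in zip(dem_before, dem_after)]


def volume_change_m3(diff, cell_size_m, min_change_m=0.0):
    """Return (gained, lost) volume, ignoring changes within +/- min_change_m."""
    cell_area = cell_size_m ** 2
    gained = sum(d for row in diff for d in row if d > min_change_m) * cell_area
    lost = sum(-d for row in diff for d in row if d < -min_change_m) * cell_area
    return gained, lost


# Hypothetical 2 x 2 elevation grids (metres) from two flights, 0.5 m cells.
before = [[10.0, 10.0], [10.0, 10.0]]
after = [[10.5, 10.0], [9.0, 10.02]]

gained, lost = volume_change_m3(dem_difference(before, after), 0.5, min_change_m=0.05)
print(f"gained {gained:.3f} m^3, lost {lost:.3f} m^3")
```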
Recent research on change detection based on UAV remote sensing data has focused on identifying small and micro-targets, such as vehicles, bicycles, motorcycles, and tricycles, and tracking their movements using UAV aerial images and video data. Another area of research involves the practical application of UAV remote sensing for detecting changes in 3D models of terrain, landforms, and buildings.
For instance, Chen et al. [68] proposed a method to detect changes in buildings using RGB images obtained from UAV aerial photography and 3D reconstruction of RGB-D data. Cook et al. [69] compared the accuracy of 3D models generated using a SfM-MVS method and LIDAR scanning measurement for reconstructing complex mountainous river terrain areas, reporting a root-mean-square error (RMSE) of 30∼40 cm. Mesquita et al. [70] developed a change detection method, which was tested on the Oil Pipes Construction Dataset (OPCD) and successfully detected construction traces from multiple pictures of the same area taken by UAV at different times. Hastaouglu et al. [71] monitored three-dimensional displacement in a garbage dump using aerial image data and the SfM-MVS method [41] to generate a three-dimensional model. Lucieer et al. [72] proposed a method for reconstructing a three-dimensional model of landslides in mountainous areas based on UAV multi-view images using the SfM-MVS method. The measured horizontal accuracy was 7 cm, and the vertical accuracy was 6 cm. Li et al. [73] monitored the deformation of the slope section of large water conservancy projects using UAV aerial photography and achieved a measurement error of less than 3 mm, a significantly higher accuracy than that of traditional aerial photography methods. Han et al. [74] proposed a method of using UAVs to monitor road construction, which was applied to an extended road construction site and identified changed ground areas with an accuracy of 84.5∼85%. Huang et al. [75] developed a semantic detection method for changes in construction sites, based on a 3D point cloud data model generated from images obtained through UAV aerial photography.
3.3. Digital Elevation Model (DEM) Information
In recent years, the accurate generation of digital elevation models (DEM) has become increasingly important in remote sensing landform research. DEMs provide crucial information about ground elevation, including both digital terrain models (DTM) and digital surface models (DSM). A DTM represents the natural surface elevation, while a DSM includes additional features such as vegetation and artificial objects. There are two primary methods for calculating elevation information: structure-from-motion and multi-view stereo (SfM-MVS) [41] and airborne laser scanning (ALS).
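The relationship between the two models can be illustrated by subtracting a DTM from a DSM over the same grid, which leaves the height of vegetation and artificial objects above the ground. This is a minimal sketch with a hypothetical function name and hypothetical elevation values; negative differences, which can arise from noise, are clipped to zero here as a simplifying choice.

```python
def above_ground_height(dsm, dtm):
    """Height of features above the terrain: DSM minus DTM, clipped at zero."""
    return [[max(0.0, surface - terrain) for surface, terrain in zip(s_row, t_row)]
            for s_row, t_row in zip(dsm, dtm)]


# Hypothetical 2 x 2 grids of elevations in metres.
dsm = [[102.0, 115.5], [101.0, 108.0]]  # includes a tree and a building
dtm = [[102.0, 101.5], [101.0, 101.0]]  # bare-earth elevation

for row in above_ground_height(dsm, dtm):
    print(row)
```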
Among the reviewed articles, the SfM-MVS method gained more attention due to its simple requirements. Sanz-Ablanedo et al. [76] conducted a comparative experiment to assess the accuracy of the SfM-MVS method when establishing a DTM model in a complex mining area covering over 1200 hectares ($1.2 \times 10^{7}\ \mathrm{m}^2$). The results showed that when a small number of ground control points (GCPs) were used, the root-mean-square error (RMSE) at the checkpoints was plus or minus five times the ground sample distance (GSD), or about 34 cm. In contrast, when more GCPs were used (i.e., more than 2 GCPs per 100 images), the RMSE at the checkpoints converged to twice the GSD, or about 13.5 cm. Increasing the number of GCPs had a significant impact on the accuracy of the 3D model generated by the SfM-MVS method. It is worth noting that the authors used a small fixed-wing UAV as their remote sensing platform. Rebelo et al. [77] proposed a method to generate a DTM by taking RGB images from multi-rotor UAVs. The authors used an RGB sensor carried by a DJI Phantom 4 UAV to take images within an area of 55 hectares ($5.5 \times 10^{5}\ \mathrm{m}^2$) and established a 3D point cloud DTM through the SfM-MVS method. Although the GNSS receiver used was the same model, the horizontal RMSE of the DTM was 3.1 cm, the vertical RMSE was 8.3 cm, and the comprehensive RMSE was 8.8 cm. This precision was much better than that of the fixed-wing UAV method of Sanz-Ablanedo et al. [76]. In another study, Almeida et al. [78] proposed a method for qualitative detection of single trees in forest land based on UAV remote sensing RGB data. In their experiment, the authors used a 20-megapixel camera carried by a DJI Phantom 4 PRO to reconstruct a DTM in the SfM-MVS mode of Agisoft Metashape, over an area of 0.15 hectares. For the DTM model obtained, the RMSE of GCPs in the horizontal direction was 1.6 cm, and that in the vertical direction was 3 cm. Hartwig et al. [79] reconstructed different forms of ravine using SfM-MVS based on multi-view images captured by multiple drones. Through experiments, the authors verified that, even without using GCPs for geographic registration, SfM-MVS technology alone could achieve a 5% accuracy in the volume measurement of ravines.
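The accuracy figures quoted above can be related to each other with a few lines of arithmetic. The sketch below shows how an RMSE is computed from checkpoint errors, and it assumes that a combined RMSE is obtained by adding the horizontal and vertical components in quadrature; whether the cited authors computed their comprehensive value in this way is an assumption. The checkpoint errors in the first example are hypothetical.

```python
import math


def rmse(errors):
    """Root-mean-square error of a list of checkpoint errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def combined_rmse(horizontal, vertical):
    """Horizontal and vertical RMSE combined in quadrature (an assumption)."""
    return math.hypot(horizontal, vertical)


def hectares_to_m2(hectares):
    """One hectare is 10,000 square metres."""
    return hectares * 10_000.0


# Hypothetical vertical errors (cm) at four checkpoints.
print(f"RMSE of sample errors: {rmse([2.0, -3.0, 1.0, -2.0]):.2f} cm")

# Reported components of 3.1 cm and 8.3 cm give a value close to the reported 8.8 cm.
print(f"combined RMSE: {combined_rmse(3.1, 8.3):.2f} cm")

# GSD implied by the reported ratios: 34 cm at 5x GSD, 13.5 cm at 2x GSD.
print(f"implied GSD: {34.0 / 5:.2f} cm and {13.5 / 2:.2f} cm")
print(f"1200 ha = {hectares_to_m2(1200):.1e} m^2")
```

Both ratios imply a GSD of roughly 6.8 cm, so the two accuracy statements for the mining-area survey are mutually consistent.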
Among airborne laser scanning (ALS) methods, Zhang et al. [53] proposed a method to detect ground height in tropical rainforests based on LIDAR data. This method involved scanning a forest area with airborne LIDAR to obtain three-dimensional point cloud data. Local minima were extracted from the point cloud data as candidate points, with some of these candidates representing the ground between trees in the forest area. The DTM generated by the method had high consistency with the ALS-based reference, with an RMSE of 2.1 m.
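The candidate-extraction step can be illustrated with a simple grid-based local minimum filter: the point cloud is divided into square cells, and the lowest point in each cell is kept as a ground candidate. This is a generic simplification for illustration only and is not the procedure of the cited study; the function name, cell size, and points are hypothetical.

```python
import math


def lowest_point_per_cell(points, cell_size_m):
    """Map each (column, row) grid cell to its lowest (x, y, z) point."""
    lowest = {}
    for x, y, z in points:
        cell = (math.floor(x / cell_size_m), math.floor(y / cell_size_m))
        if cell not in lowest or z < lowest[cell][2]:
            lowest[cell] = (x, y, z)
    return lowest


# Hypothetical point cloud (metres): canopy hits and a few ground hits.
cloud = [
    (0.2, 0.3, 12.0), (0.8, 0.1, 3.5), (0.5, 0.5, 3.9),  # cell (0, 0)
    (1.4, 0.2, 15.0), (1.6, 0.9, 4.1),                   # cell (1, 0)
]

for cell, point in sorted(lowest_point_per_cell(cloud, 1.0).items()):
    print(cell, point)
```

Not every cell minimum is true ground, since a cell fully covered by canopy yields a canopy point, which is why such minima are treated only as candidates.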