Crowd monitoring and analysis is an important and evolving application of unmanned aerial vehicles (UAVs), or drones. From preventing stampedes in high-concentration crowds to estimating crowd density and surveilling crowd movements, crowd monitoring and analysis have long been employed by authorities and regulatory bodies to tackle the challenges posed by large crowds.
Crowd monitoring and analysis have become vital from public security and safety viewpoints because crowd participants may show abnormal behavior. An increase in crowd density and the associated abnormal behavior among its participants may ultimately lead to stampede incidents with a risk of injuries. The risk multiplies when coupled with strict spatiotemporal constraints, such as those exercised in religious gatherings. Furthermore, in such mass gatherings, potential public health threats are even more severe, ranging from transmission of infectious diseases and thermal disorders to the possibility of terrorism incidents and violent crowd behaviors resulting from alcohol consumption and/or substance abuse. Planned religious gatherings can attract millions of people into designated areas, enhancing societal values. For example, Hajj is considered one of the largest planned mass gatherings, where over two million Muslims gather annually in Mecca. Stampede incidents in Hajj and other mass religious gatherings, such as the Kumbh Mela in India, are partly attributed to abnormal crowd behaviors, resulting in panic and subsequently fatal accidents.
Traditional crowd analysis methods have relied on visual inputs obtained from static or fixed-location cameras that record images or videos, resulting in fixed-angle visibility and limited coverage. In addition, fixed cameras cannot perform persistent and continuous tracking of moving crowds unless a massive network of monitoring devices is deployed. In recent years, unmanned aerial vehicles, commonly known as drones, have been deployed to perform crowd analyses, complementing the fixed monitoring devices. Drones, once used primarily for military applications, are now gaining interest for capturing footage that would otherwise require the deployment of helicopters and manned aircraft. Specifically, the use of drones provides the following advantages: (1) the ability to be equipped with the required sensors and payloads for acquiring additional metrics alongside visual data, (2) the availability of real-time data for crowd dynamics modeling, with the help of powerful onboard processing units for estimating the crowd dynamics, and (3) lower overall operational costs, as the same monitoring device can be redeployed elsewhere to effectively increase its coverage, as well as reduced human resource requirements. The benefits of drone use have already been established in other fields where mobility and aerial access tremendously increase visibility and site access, such as the maritime environment, agricultural technologies, mining industries, and the management of disasters, tsunamis, and pandemics.
The crowd monitoring and analyses discussed in this paper can be further divided into several domains, namely crowd detection, crowd counting, crowd density estimation, crowd tracking, and crowd behavior analysis. Whereas most works have focused on a single domain, recent results from the literature have discussed algorithms developed to tackle multiple domains; these and similar areas are discussed further in the sections that follow.
2. Drone Architecture
Multiple drone architectures exist in the literature, classified by function, weight and size, performance characteristics, and engine type. This section discusses the architecture of drones used for crowd monitoring and analysis, covering their build, onboard sensors, communication, and power management strategies.
2.1. Drone Build
Arguably, the vast selection of drone architectures narrows when drones are used for monitoring human crowds. A significant proportion of the drones used in the literature are multi-rotor drones with a vertical take-off and landing (VTOL) mechanism. The choice of multi-rotor VTOL drones is advantageous in several respects. Firstly, VTOL is preferred to other mechanisms since it does not require additional launching platforms such as runways or catapults. This allows easy configuration and fast deployment for crowd monitoring purposes and requires a smaller deployment area. Secondly, multi-rotor VTOL drones can hover in one place, making them the preferred choice for monitoring as they can be positioned above the crowds. This is especially useful for still crowd imaging, as well as continuous crowd monitoring applications. The second choice of architecture presented in the literature, albeit scarce, is the fixed-wing drone, which has the advantage of longer flight endurance and higher efficiency.
Most of the works in the literature chose off-the-shelf commercial drones for this purpose, notably from the manufacturer DJI, which in 2020 accounted for more than 70% of the drone market share in the US. Off-the-shelf drones are likely chosen because they come as an integrated package that includes the drone build, flight controller, mission planning system, and data transmission link, allowing for plug-and-play deployment. However, some works mention utilizing a custom-made drone to allow full access to the underlying hardware, especially to enable onboard image processing. Shao et al. describe using a quadrotor with an NVIDIA Jetson TX1 embedded computer to carry out real-time image processing, coupled with an STM32F427VIT6-based flight controller. Custom hardware deployments allow complete control of the hardware interfacing and remove the restrictions imposed by the proprietary software and code shipped with commercial drones.
2.2. Visual and Onboard Sensors
A visual sensor is a requirement for crowd-related drone activities. The RGB, or visible-light, camera is the most widely used and is the type most commonly equipped on commercial drones. In addition to RGB, the thermal camera is another visual sensor that is either used on its own or used in combination with its RGB counterpart.
RGB images have been the gold standard for image processing applications. RGB images often have higher resolution and contain richer details regarding the area being surveyed. However, RGB images are often affected by the environment, for instance through susceptibility to illumination changes or quality degradation due to limited lighting (for example, during nighttime captures). Thermal imagery compensates for these disadvantages by providing the heat signature of the objects in the area of interest, regardless of the environmental illumination conditions. However, thermal images often suffer from lower spatial resolution.
Although commercial drones often come with additional onboard sensors such as a Global Positioning System (GPS) receiver, most of the crowd-related literature does not describe taking advantage of these sensors. Bhattarai et al. have described the use of onboard GPS, in addition to the RGB camera, to geo-locate humans detected during aerial drone surveillance. This, in turn, has allowed the authors to visualize the location of the detected humans on a Geographic Information System (GIS) platform such as Google Earth. Similarly, Singh et al. have discussed the use of the onboard inertial measurement unit (IMU) to calculate the drone’s horizontal velocity. In both cases, built-in or mounted onboard sensors were used as an augmentative feature to obtain additional data during the drone operation.
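As an illustration of how a GPS fix can be combined with a camera detection, the following is a minimal sketch that converts the pixel position of a detected person into approximate latitude and longitude. It is not the method of Bhattarai et al.; it assumes a downward-pointing (nadir) camera, flat ground, square pixels, a known altitude above ground, and offsets small enough for a local flat-Earth approximation. The function name `pixel_to_latlon` and all parameter values are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used for the small-offset approximation


def pixel_to_latlon(px, py, img_w, img_h, drone_lat, drone_lon,
                    alt_m, hfov_deg, heading_deg):
    """Approximate lat/lon of an image pixel for a nadir camera over flat ground.

    Assumptions (hypothetical setup): the top of the image points along the
    drone heading, heading is measured clockwise from north, and pixels are square.
    """
    # Ground width covered by the image, from altitude and horizontal field of view.
    ground_w = 2.0 * alt_m * math.tan(math.radians(hfov_deg) / 2.0)
    m_per_px = ground_w / img_w

    # Offsets from the image centre in the drone body frame (metres).
    right_m = (px - img_w / 2.0) * m_per_px
    forward_m = (img_h / 2.0 - py) * m_per_px  # image y grows downward

    # Rotate body-frame offsets into north/east using the heading.
    psi = math.radians(heading_deg)
    north = forward_m * math.cos(psi) - right_m * math.sin(psi)
    east = forward_m * math.sin(psi) + right_m * math.cos(psi)

    # Convert metre offsets to degrees (valid for small offsets only).
    lat = drone_lat + math.degrees(north / EARTH_RADIUS_M)
    lon = drone_lon + math.degrees(
        east / (EARTH_RADIUS_M * math.cos(math.radians(drone_lat))))
    return lat, lon


# Hypothetical example: a detection near the right edge of a 1920 x 1080 frame,
# drone hovering at 50 m and facing north.
print(pixel_to_latlon(1800, 540, 1920, 1080, 10.0, 20.0, 50.0, 84.0, 0.0))
```

The resulting coordinates could then be exported to a GIS platform for visualization. A camera that is tilted, or terrain that is not flat, would require a full projective model instead of this approximation.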
2.3. Communication
In a single-drone deployment scenario, the primary communication link is bi-directional between a drone and its Ground Control Station (GCS). Commercial drones such as those from DJI use proprietary communication protocols such as OcuSync and Lightbridge, in addition to the WiFi transmission system. Open-source protocols also exist for building communication links for custom-made drones. One such protocol is the Micro Air Vehicle Link (MAVLink), which employs a lightweight binary serialization protocol. As the complexity of operation increases, communication can shift from a centralized GCS to decentralized drone-to-drone communication.
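To give a sense of what a lightweight binary serialization looks like in practice, the sketch below packs and unpacks a single HEARTBEAT message. It assumes the MAVLink v1 frame layout (start byte 0xFE, five header bytes, payload, 16-bit checksum) and the HEARTBEAT field order and checksum seed from the common message set. The helper names and the example field values are illustrative only, and a real deployment would normally rely on an existing MAVLink library such as pymavlink rather than hand-written framing.

```python
import struct

STX_V1 = 0xFE             # MAVLink v1 start-of-frame marker
HEARTBEAT_ID = 0          # message ID of HEARTBEAT
HEARTBEAT_CRC_EXTRA = 50  # per-message checksum seed (assumed from the common message set)
PAYLOAD_FMT = "<IBBBBB"   # custom_mode, type, autopilot, base_mode, system_status, version


def x25_crc(data: bytes) -> int:
    """CRC-16 accumulation used by MAVLink (initial value 0xFFFF, no final XOR)."""
    crc = 0xFFFF
    for byte in data:
        tmp = (byte ^ crc) & 0xFF
        tmp = (tmp ^ (tmp << 4)) & 0xFF
        crc = ((crc >> 8) ^ (tmp << 8) ^ (tmp << 3) ^ (tmp >> 4)) & 0xFFFF
    return crc


def pack_heartbeat(seq, sysid, compid, vehicle_type, autopilot,
                   base_mode, custom_mode, system_status, version=3) -> bytes:
    """Serialize one HEARTBEAT into a 17-byte v1 frame."""
    payload = struct.pack(PAYLOAD_FMT, custom_mode, vehicle_type, autopilot,
                          base_mode, system_status, version)
    header = struct.pack("<BBBBB", len(payload), seq & 0xFF, sysid, compid, HEARTBEAT_ID)
    crc = x25_crc(header + payload + bytes([HEARTBEAT_CRC_EXTRA]))
    return bytes([STX_V1]) + header + payload + struct.pack("<H", crc)


def unpack_heartbeat(frame: bytes) -> dict:
    """Validate a v1 HEARTBEAT frame and return its fields."""
    if len(frame) != 17 or frame[0] != STX_V1 or frame[5] != HEARTBEAT_ID:
        raise ValueError("not a MAVLink v1 HEARTBEAT frame")
    (crc,) = struct.unpack("<H", frame[-2:])
    if x25_crc(frame[1:-2] + bytes([HEARTBEAT_CRC_EXTRA])) != crc:
        raise ValueError("checksum mismatch")
    custom_mode, vtype, autopilot, base_mode, status, version = struct.unpack(
        PAYLOAD_FMT, frame[6:15])
    return {"seq": frame[2], "sysid": frame[3], "compid": frame[4],
            "type": vtype, "autopilot": autopilot, "base_mode": base_mode,
            "custom_mode": custom_mode, "system_status": status, "version": version}


# Illustrative values: type 2 (quadrotor), generic autopilot, status 4 (active).
frame = pack_heartbeat(seq=0, sysid=1, compid=1, vehicle_type=2, autopilot=0,
                       base_mode=0, custom_mode=0, system_status=4)
print(len(frame), unpack_heartbeat(frame))
```

The whole message, including framing and checksum, occupies 17 bytes, which is what makes this style of protocol attractive for bandwidth-limited links between a drone and its GCS.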
Most works in the literature concerning aerial crowd monitoring lack discussion of the communication architecture, most likely suggesting the use of the built-in communication protocol bundled with the purchased drone. Several works in this domain, however, have discussed the communication methods in additional detail. One work presented a framework for persistent crowd tracking, dubbed PERCEIVE, which consists of a swarm of drones carrying out video surveillance using a novel charging scheduling scheme and a mobile ground charging station. The authors cited an improvement in efficiency in comparison with a fixed charging station or no charging. In another crowd analysis application, the authors described using cloud computing to analyze images because of the memory-intensive computation required by the corresponding algorithms. This allows a segment of the intensive drone tasks to be offloaded to dedicated hardware, which can relay the processing results back to the drone for further action. However, reliance on this feature might hamper the drone operation if internet connectivity is lost. An emerging trend in drone communication is the use of cloud computing or the Internet of Things (IoT) to augment drone operation. This concept has been proposed in the Industrial IoT (IIoT) field, such as for optimal power line inspection using UAVs. Some architectures have been developed with a focus on connectivity in resourceless or resource-constrained settings, such as drone-assisted vehicular networks (DAVN) or flying ad hoc networks (FANET) for IoT-enabled scenarios, giving rise to the concept of the Internet of Drones (IoD). This trend has been observed elsewhere in applications that require transmitting environmental sensor data for constant monitoring and is expected to have significant integration in the field of crowd monitoring as well.
2.4. Power Management
Although drones can perform tasks beyond human capability, the current technology, which relies on onboard batteries, limits their operation. Specifically, the flight time is constrained to only tens of minutes, and any additional payload carried by the drone reduces it further. For example, a DJI Matrice UAV has shown a reduction of nearly 46% in its flight time when carrying an additional 320 g of camera payload. A survey of the literature indicates two strategies for extending drone flight time.
The first is the use of an alternative power supply, which provides additional power for flight without the need for frequent recharging. The second is the use of a charging station (CS), either on the ground or on top of a building or vehicle, that employs various technologies to recharge an operational drone. The former method is used in the commercial Perimeter 8 multirotor drone (Skyfront), which operates on a hybrid gasoline-electric engine system and has demonstrated around 13 h of flight time over California’s Coastal Range. Similarly, the Hybrix 2.1 (Quaternium) multirotor drone has demonstrated over 10 h of flight time using a petrol-electric fuel-injection engine. Whereas an advanced propulsion system provides enhanced endurance, it also significantly increases the cost of these drones. Nevertheless, the availability of these technologies for multi-rotor drones offers promising prospects for crowd monitoring and analysis applications.
Other works have employed charging station (CS) strategies that allow conventional drones to operate continuously. One method automated the swapping of multiple drones by identifying when a drone needs a recharge, deploying another one in its place, and directing the low-powered drone to a ground CS. The charging is conducted contactlessly to ensure complete automation, removing the need for an operator to remove and install the battery manually. In a similar concept, instead of a fixed ground CS, Trotta et al. have described the use of a mobile CS for continuous crowd tracking, in which the drone charging can be scheduled dynamically by changing the location of the CS to follow the drone position, yielding higher efficiency and continuous connectivity of service.
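The drone-swapping idea described above can be sketched as a simple threshold rule: when the active drone's battery falls to a set level, the best-charged standby drone takes over and the depleted one is sent to charge. The following is a toy simulation under assumed values, not the scheduling algorithm of any cited work; the `Drone` class, the `step` function, and the drain, charge, and threshold figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Drone:
    name: str
    battery: float          # remaining charge in percent
    state: str = "standby"  # one of "active", "charging", "standby"


def step(fleet, drain=5.0, charge=10.0, low=20.0):
    """Advance the fleet by one time step and return the active drone's name.

    drain/charge are percent per step and low is the swap threshold;
    all three are illustrative values, not measured figures.
    """
    # Update battery levels according to each drone's current state.
    for d in fleet:
        if d.state == "active":
            d.battery = max(0.0, d.battery - drain)
        elif d.state == "charging":
            d.battery = min(100.0, d.battery + charge)
            if d.battery >= 100.0:
                d.state = "standby"

    # Swap if nothing is flying or the active drone has reached the threshold.
    active = next((d for d in fleet if d.state == "active"), None)
    if active is None or active.battery <= low:
        standby = [d for d in fleet if d.state == "standby"]
        if standby:
            replacement = max(standby, key=lambda d: d.battery)
            replacement.state = "active"
            if active is not None:
                active.state = "charging"  # send the depleted drone to the CS
            active = replacement
    return active.name if active else None
```

With two drones, for example `[Drone("A", 100.0, "active"), Drone("B", 100.0)]`, repeated calls to `step` alternate the monitoring role between them. A mobile CS, as in the work of Trotta et al., would additionally account for travel time between the drone and the station, which this sketch ignores.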
Another method discussed is ‘battery hot-swapping’, in which a battery on the drone is switched at the CS without turning the drone off (for example, using a robotic arm). In this case, multiple backup batteries are made available instead of multiple drones. However, a gap in operation exists while the hot swapping takes place. Other proposed methods include contactless drone charging from the CS using laser beam technology while the drone is in mid-air. In the context of crowd monitoring, this method requires the utmost attention to operational safety, since high-intensity lasers can be hazardous to human health and can cause disturbance in urban living areas. Photovoltaic (PV) cells harvesting solar energy have also been investigated in the literature to allow charging without requiring landing maneuvers. However, this technique depends on adequate solar irradiation and is mainly suitable for fixed-wing drones due to the large surface area available on the drone body for attaching the PV cells. For continuous aerial surveillance, one alternative method suggested in the literature is a tethered drone system. In this method, a drone is powered from a ground station using generators or battery packs, effectively providing unlimited endurance. In addition, the tether can provide an additional secured communication uplink/downlink channel. A significant drawback of this method is the cable length restriction, which in turn restricts the drone coverage area. One improvement suggested to overcome this limitation is to introduce a network of drones chained via tethering cables, with control mechanisms implemented on the ground station. Figure 1 summarizes the drone power management strategies.
Figure 1. Drone power management. (a) Solar-powered drones, suitable for fixed-wing UAVs. (b) Hybrid-powered drones, combining energy from electric and fuel cells. (c) Drone-swapping method, with a fixed charging station. (d) Laser-powered on-the-air recharging. (e) Battery hot-swapping, without powering down the drone. (f) Mobile recharging stations, with programmed charging scheduling, and (g) Tethered drones.
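The cable-length restriction of tethered drones noted above can be made concrete with a small geometric sketch. Assuming an idealized taut, straight tether anchored at a ground station (cable sag, wind, and safety margins are ignored), the horizontal distance the drone can reach at a given altitude follows from the Pythagorean theorem. The function name and the example figures are hypothetical.

```python
import math


def tether_reach_m(cable_length_m: float, altitude_m: float) -> float:
    """Horizontal reach of a tethered drone, assuming a taut, straight cable."""
    if altitude_m > cable_length_m:
        raise ValueError("altitude cannot exceed the cable length")
    return math.sqrt(cable_length_m ** 2 - altitude_m ** 2)


# Hypothetical example: a 100 m cable with the drone flying at 60 m altitude.
reach = tether_reach_m(100.0, 60.0)
print(reach, math.pi * reach ** 2)  # horizontal reach (m) and reachable ground area (m^2)
```

Under this idealization, flying higher on the same cable shrinks the area over which the drone can be positioned, which is one reason a chained network of tethered drones has been proposed.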
Since adding payloads such as cameras and recording equipment will inevitably reduce the drone’s flight time, efficient power management is necessary for monitoring and analysis applications. Table 1 summarizes the drone architecture as discussed in this section.
Table 1. Drone architectures utilized in the literature for crowd monitoring and analysis.
|DJI Phantom 4, DJI Phantom 4 Pro and DJI Mavic
||RGB + Thermal
||512 × 640 (Thermal)
||RGB + Thermal
||1920 × 1080 (RGB)
320 × 240 (Thermal)
|DJI Matrice 100
|DJI Phantom 4 Pro
||336 × 256
||QGroundControl + MAVLink
||640 × 512
|DJI Matrice 100
|DJI Phantom 4, DJI Phantom 4 Pro, DJI Mavic
||1920 × 1080
||720 × 960
|Parrot AR 2.0
||2 × RGB
||1280 × 720 (front facing)
320 × 240 (downward facing)