Unmanned Aerial Vehicles (UAVs), also known as drones, have advanced greatly in recent years. Drones can be used in many ways, including transportation, photography, climate monitoring, and disaster relief, largely because of the efficiency and safety they offer in these operations. Drone design, however, is not yet flawless: detecting and preventing collisions remains a major challenge. In this context, this research describes a methodology for developing a drone system that operates autonomously without the need for human intervention. It applies reinforcement learning algorithms to train a drone to avoid obstacles autonomously in discrete and continuous action spaces based solely on image data. The research compares three different reinforcement learning strategies, namely Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC), for avoiding both stationary and moving obstacles. The novelty of this research lies in its comprehensive assessment of the advantages, limitations, and future research directions of obstacle detection and avoidance for drones using different reinforcement learning techniques. The findings could have practical implications for the development of safer and more efficient drones in the future.
UAVs are reshaping the aviation industry and have emerged as potential successors of conventional aircraft. Operating without a pilot on board, they are proving extremely convenient for reaching the most remote regions. Today, drones are available in a variety of shapes, sizes, ranges, specifications, and equipment. They serve various purposes, such as transporting commodities, taking photos and videos, monitoring climate change, and conducting search operations after natural disasters [1]. The development of UAVs has a significant impact on the market, the economy, and a variety of industries. Although drones have been available for a long time, research on their capacity for autonomous obstacle avoidance has only recently gained attention from the scientific community. Most drone practices still rely heavily on human operators. Drones with autonomous capabilities, however, can be extremely useful in emergencies since they provide immediate situational awareness and direct response efforts without the need for a pilot on-site. Security and inspection issues can also be addressed using autonomous drone systems. With predefined GPS coordinates for the departure and destination, researchers initially focused on self-navigating drones that could determine the optimal route and arrive at a predetermined location without human assistance [2]. Although GPS-based navigation is often claimed to avoid collisions, a drone may still collide with a tree, a building, or another drone during its flight.
Different strategies have been implemented in vision-based navigation systems to tackle the problem of obstacle detection and avoidance. The most popular ones, among others, are based on geometric relations, fuzzy logic, potential fields, and neural networks [3]. In recent years, the field of deep reinforcement learning (DRL) has experienced exponential growth in both research and applications. Reinforcement learning (RL) can address a variety of complicated decision-making tasks that were previously beyond the capacity of a machine, allowing it to act more like a human and solve problems in the real world. Enabling autonomous drones to detect and avoid obstacles with a high degree of accuracy remains a challenging task. Applying reinforcement learning to collision avoidance can provide a model that dynamically adapts to the environment. In RL-based techniques, the agent attempts to learn in a virtual world that is analogous to a real-world setting, and then the trained model is applied to a real drone for testing [4]. When training RL models for effective collision avoidance, it is imperative to minimise the gap between the training environment and the real world, and different obstacle avoidance scenarios should be taken into account during training. Choosing the most suitable reinforcement learning model is essential to obtaining the best results.
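To make this simulate-then-transfer workflow concrete, the minimal sketch below shows a gym-style agent-environment interaction loop with image observations. The DroneObstacleEnv class, its action set, and its reward shaping are illustrative placeholders rather than components of the referenced works; in practice, the environment would wrap a photorealistic simulator and the random policy would be replaced by a trained DQN, PPO, or SAC network.

```python
# Minimal sketch of the simulate-then-transfer RL workflow described above.
# "DroneObstacleEnv" is a hypothetical stand-in for a simulator wrapper that
# returns forward-camera frames; it is NOT taken from the referenced paper.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class DroneObstacleEnv(gym.Env):
    """Toy placeholder: 84x84 grayscale camera frames, discrete motion actions."""

    def __init__(self):
        self.observation_space = spaces.Box(0, 255, shape=(84, 84, 1), dtype=np.uint8)
        self.action_space = spaces.Discrete(3)  # 0: forward, 1: yaw left, 2: yaw right
        self._steps = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._steps = 0
        return self.observation_space.sample(), {}

    def step(self, action):
        self._steps += 1
        obs = self.observation_space.sample()   # real env: render the camera frame
        reward = 1.0 if action == 0 else 0.1    # real env: progress minus collision penalty
        terminated = False                      # real env: True on collision
        truncated = self._steps >= 200
        return obs, reward, terminated, truncated, {}


# Standard agent-environment interaction loop: the policy would be a DQN/PPO/SAC
# network in practice; a random policy is used here only to show the data flow.
env = DroneObstacleEnv()
obs, _ = env.reset()
for _ in range(1000):
    action = env.action_space.sample()          # replace with policy(obs)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
```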
Autonomous Drone Based on Computer Vision and Geometric Relations [3][5][6][7][8][9][10]: Early research in autonomous drone navigation focused on computer vision algorithms and geometric relationships to detect and navigate around obstacles. Various approaches, including real-time obstacle detection algorithms and adaptive navigation algorithms based on geometric relationships, were explored. These methods, although computationally inexpensive, had limitations in fully addressing the navigation problem. Researchers also investigated techniques using SIFT detectors, SURF feature matching, and template matching for obstacle identification, emphasizing the challenges of frontal collision detection with monocular imagery.
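As a rough illustration of these classical, computationally inexpensive approaches, the sketch below matches SIFT keypoints between two consecutive monocular frames and treats widespread growth in keypoint scale as a cue that a frontal obstacle is approaching. The file names and the expansion threshold are assumptions for illustration, not values from the cited papers.

```python
# Hedged illustration of feature-based frontal obstacle cues from monocular imagery:
# SIFT keypoints are matched across two consecutive frames, and the growth of
# matched keypoint scale is used as a crude signal that an object is getting closer.
import cv2

prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)   # illustrative file names
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(prev, None)
kp2, des2 = sift.detectAndCompute(curr, None)

# Brute-force matching with Lowe's ratio test to keep reliable correspondences.
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

# Keypoints that grow in scale between frames suggest an approaching frontal obstacle.
expansion = [kp2[m.trainIdx].size / kp1[m.queryIdx].size for m in good]
if expansion and sum(e > 1.2 for e in expansion) / len(expansion) > 0.5:
    print("Possible frontal obstacle: most matched features are expanding")
```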
Autonomous Drone Based on Supervised Learning [11][12][13]: Supervised learning methods for drone navigation involved training models to recognize patterns and relationships between input data and output labels. Examples include control algorithms for drones navigating forest environments and object detection algorithms using deep neural networks (DNN) and computer vision. While these approaches showed improvements in obstacle-distance estimation accuracy and computational efficiency, they often relied heavily on specific features and preplanned GPS tracks.
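A minimal sketch of this supervised formulation is given below: a small convolutional network maps a grayscale camera frame to one of three steering labels. The architecture, input size, and dummy batch are illustrative assumptions, not the models proposed in the cited works.

```python
# Hedged sketch of supervised image-to-label navigation: a small CNN predicts a
# steering command (left / straight / right) from a single camera frame.
import torch
import torch.nn as nn

class SteeringNet(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SteeringNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative optimisation step on a dummy batch of 84x84 grayscale frames;
# a real dataset would pair recorded frames with expert steering labels.
images = torch.rand(8, 1, 84, 84)
labels = torch.randint(0, 3, (8,))
loss = criterion(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```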
Autonomous Drone Based on Reinforcement Learning [4][14][15][16][17]: Reinforcement learning (RL) emerged as a promising approach for drone navigation, allowing agents to interact with an environment, optimize their behaviour, and adapt to dynamic scenarios. Researchers explored RL algorithms such as Nature-DQN, Dueling-DQN, Deep Deterministic Policy Gradient (DDPG), and Soft Actor-Critic (SAC) methods. These works compare RL algorithms in discrete and continuous action spaces, demonstrating their effectiveness in training drones for obstacle avoidance. However, challenges remain in scaling RL algorithms to large state spaces and in handling the complexity of navigating around dynamic obstacles.
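Such a comparison of discrete and continuous action spaces could, for example, be organised with an off-the-shelf RL library as sketched below. The make_discrete_env and make_continuous_env helpers are assumed simulator wrappers returning camera images (hypothetical, not part of the referenced studies), and stable-baselines3 is used here purely as one possible implementation.

```python
# Hedged sketch of a discrete- vs continuous-action comparison using
# stable-baselines3. The environment factory functions are placeholders.
from stable_baselines3 import DQN, PPO, SAC

# Discrete action space (e.g. forward / yaw left / yaw right): DQN and PPO apply.
dqn = DQN("CnnPolicy", make_discrete_env(), buffer_size=50_000, verbose=1)
ppo = PPO("CnnPolicy", make_discrete_env(), n_steps=1024, verbose=1)

# Continuous action space (e.g. velocity setpoints): SAC requires a Box action space.
sac = SAC("CnnPolicy", make_continuous_env(), buffer_size=50_000, verbose=1)

# Train each agent on the same obstacle-avoidance task and save the policies
# so they can later be evaluated against stationary and moving obstacles.
for model in (dqn, ppo, sac):
    model.learn(total_timesteps=100_000)
    model.save(f"{type(model).__name__.lower()}_obstacle_avoidance")
```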
This entry is adapted from the peer-reviewed paper 10.3390/drones7040245