Decentralized Multi-Robot Collision Avoidance: Comparison
Please note this is a comparison between Version 2 by Dean Liu and Version 1 by Abdul Hadi Abd Rahman.

When deploying a multi-robot system, it is ensured that the hardware parts do not collide with each other or the surroundings, especially in symmetric environments. Two types of methods are used for collision avoidance: centralized and decentralized. The decentralized approach has mainly been used in recent times, as it is computationally less expensive.

  • decentralized
  • multi-robot
  • collision avoidance

1. Introduction

The use of autonomous mobile robots has increased in various fields of life in recent times. Mobile robots are being used in industry, medicine, search and rescue, and other applications. Most of these applications require a team of robots to fulfill the task efficiently and in less time than their human counterparts [1]. However, they face some issues hindering them from optimal path planning due to the symmetrical shape of the environment. Multiple robots are expected to explore more areas in less time while solving robot localization and collision-avoidance issues. In a scenario where a team of mobile robots is operating, it is necessary to keep them safe from collisions and the surroundings. Collision avoidance for a multi-robot system forms the focus of many current studies. When deploying a multi-robot system, it is ensured that the hardware parts do not collide with each other or the surroundings, especially in symmetric environment. There are two main approaches used to address collision avoidance in multi-robot systems: centralized and decentralized [2]. The centralized approach is efficient only for small groups of robots, and for large groups, the decentralized approach is more effective, as it is less expensive computationally. A decentralized method that uses a triangular grid pattern was introduced by Yang [3], using previously explored maps and information from neighbors to avoid collisions.
The agents need a communication system to inform them about their current position and velocity in a decentralized approach. Various types of communication setups have been introduced in recent years. The Alternating Direction Method of Multipliers (ADMM) used by Rey et al. [4] works so that each agent communicates with the neighboring agents, creating a local coordinate system. The setup can manage transmission from all the agents simultaneously. Being a decentralized network, it can manage frequent setup changes and failures. Rodríguez et al. [5] used the velocity information of each agent to create a cooperative communication setup. The relative velocity of agents was then used to decrease an individual agent’s detection region, resulting in collision avoidance. A scheme suggesting the use of a shared memory-driven device for coordination and communication was presented by Pesce et al. [6]. Each agent utilizes the shared device to learn from the collective private observations and share relevant information with other agents. Deep reinforcement learning and Voronoi cell setup were proposed in Wang et al. [7] and Nguyen et al. [8] to ensure cooperative behavior among the agents. Another approach towards collision avoidance is by using a local or global path planner. These approaches use path-planning strategies such as GMapping [9] to avoid collisions [10,11][10][11]. Socially aware path-planning strategies have gained popularity recently. Avoidance of robot-to-pedestrian collisions, human-like speed of motion, and navigation through dense environments require carefully planned trajectories [12,13][12][13]. When planning a trajectory, the structure of the environment also plays a vital role in collision avoidance. It is essential to explore the environment carefully to maintain a low time and energy cost while reaching the target. Some task planner techniques have been presented to address this issue [14]. However, very few studies have discussed cluttered and confined environments [15,16][15][16].
In the process of exploring the databases for this SLR study, it was observed that most of the studies on collision avoidance are conducted for aerial vehicles [17,18,19][17][18][19], surface vehicles [20[20][21][22],21,22], ships [23,24,25,26,27][23][24][25][26][27] and underwater vehicles [15,28,29][15][28][29] However, the focus of this SLR is only on ground vehicles. By carefully scanning the titles and abstracts of the studies, any study using other than ground vehicles was excluded. In this SLR, 17 studies from 2015 to date are selected after systematic screening. All these studies address the collision-avoidance issue for multi-robots (ground vehicles) directly or indirectly. These studies propose new algorithms such as CADRL or innovate classic techniques such as ORCA to avoid collisions. Although various studies were conducted before 2015, they rarely addressed the issue of collision avoidance in a decentralized manner for confined spaces. Many studies attempted to resolve the said issue for no more than two agents. In this SLR, wresearchers select only those studies which use decentralized approaches for a large group of agents in confined space scenarios. Some studies experimented with up to a hundred agents in different scenarios [2], while others considered techniques such as ECBS for a group of agents [30].

2. Characteristics of Collision Avoidance in Multi-Robot System Techniques

This section discusses RQ2, involving the main characteristics of various collision-avoidance techniques using reinforcement learning (RL) and non-reinforcement learning (non-RL) for a multi-robot system. It was observed that only a few studies address the said issue directly, while others propose a path-planning algorithm that avoids collisions. Each study considered the hardware specifications of the agent to implement the proposed model effectively. Different real-world scenarios were evaluated to narrow the simulation and real transition gap. Following is a summary of each related study. The summary of the technique experimental design, benchmark, strength, and weakness on all selected studies are summarized in Table 3.
Table 3. Synthesis analysis of decentralized collision avoidance.
References Technique No of Agents Experimental Design Benchmark Strength Weakness
Non-RL approaches          
[33][31] Continuous Best-Response Approach 4 Indoor but open space maps ORCA Shorter paths Prolongation issue
[30] Push and rotate with bounded-suboptimal ECBS 4 Open Space ORCA Higher success rate N/A
[39][32] Voronoi cells superimposed with RVO cones 4, 10, 25, 70 Open space ORCA Passive collision avoidance guarantees Not optimized
[40][33] Safe Zone Discrete-ORCA-MPC 4 Open space PID, PB-NC-DMPC Non-holonomic robots Not suitable for the unknown scenario
[41][34] Alturistic coordination 8 Indoor and confined spaces PID Local adjustments capability Ignore low-level control parameters
[36][35] Modified MR-DFS strategy 2 Open space MR-DFS Reduces edge traversals Only static obstacles
[5] Velocity-based detection regions Up to 20 Open space Constant

detection radii
Yield faster

and less

conservative trajectories
[34][36] Spot Auction-based Robotic Collision Avoidance Scheme (SPARCAS Up to 500 Open space M* Prioritization and dynamic handling Not optimized
[42][37] Independent virtual center points 2, 6, 8 Open space Monte Carlo An arbitrary number of agents N/A
RL approaches          
[13] CADRL 2, 4, 6, 8 Open space ORCA Higher path quality Stuck in a dense setting
[37][38] RL-ORCA Up to 42 Open space ORCA Congestion and deadlock capability N/A
[2] Hybrid-RL Up to 100 Open space with obstacles NH-ORCA, RL High success rate and generalization capability Not socially compliant
[12] Multi-sensor Fusion and DRL 4 Open and confined space A*, D* Dense environment Oscillatory behavior in a highly spacious environment
[43][39] Deep Q learning—CNN 4 Open and confined space A*, D*   Unnecessary movement occurs
[44][40] End-to-end DRL policy Team of 3 Open space Ind-PPO, MAPPO Navigate through a dense environment Unable to address deadlock situation
[38][41] CBF-MARL 2 Open space MADDPG Safety guarantees Only static obstacles
For non-RL approaches, eight different studies are selected with various agents in both open and confined spaces. Most studies used ORCA as their benchmark evaluation. Cap et al. [33][31] address the problem of finding a collision-free trajectory for an agent in a dynamic environment. The setup considered is an infrastructure with agents already performing tasks when a new task is assigned to an individual agent. The proposed algorithm implements a token system and plans a global trajectory considering all the agents. Meanwhile, Dergachev et al. [30] suggest coordinating sub-groups of the agents that appear to be deadlocked, using locally confined multi-agent pathfinding (MAPF) solvers. The limitation of this model is its assumption that each agent has prior knowledge of the environment and its failure to perform in uncertain situations. Arul et al. [39][32] used buffered Voronoi cells (BVC) and reciprocal velocity obstacles (RVO) to develop a collision-avoidance method for dense environments. To calculate a local collision-free path for each agent, first, a suitable direction is computed by superimposing BVC and RVO cones together. However, this method does not guarantee deadlock resolution, similar to earlier decentralized methods—more studies focus on alternative approaches to avoid the need for global communication among robots [33,41,45][31][34][42]. Mao et al. [40][33] presented a collision-avoidance approach by considering the non-holonomic constraints of the agents. The proposed method is cheaper than PB-NC-DMPC, as it does not use central coordination or rely on communication among the robots. Another study by Wei et al. [41][34] proposed altruistic coordination where each robot is ready to make concessions whenever in congested situations. It is demonstrated that when robots face a congested situation, they can implement waiting, moving forward, dodging, retreating, and turning-head strategies to make local adjustments. Another approach using robot graph exploration is proposed by Nagavarapu et al. [36][35], in which no direct communication is needed to avoid collisions. A data structure is proposed to provide efficient information exchange. Modified Multi-Robot Depth First Search (MR-DFS) strategy is used to achieve better execution than other tree strategies. Zhang et al. [42][37]. suggest a technique using two cooperative strategies to decrease the effective detection regions of the vehicles, for a random number (large number) of agents, using velocity information. Another approach using prioritization is presented by Das et al. [34][36], where agents intentionally disclose their information in order to become prioritized. Competitive robots take part in spot auctions, where they show their willingness to pay the price to obtain access to the desired location. The results show that the proposed method can manage dynamic arrival without compromising the path-length optimality too much. Zhang et al. [42][37] propose an obstacle-avoidance method that incorporates virtual center points, implemented in a distributed manner, which is set based on the current state of the nearby robots and the agent itself. The stability of the system is proved using a Lyapunov function. Two control modes—the obstacle-free mode and obstacle avoidance mode—are used for robots, which are switched carefully using a direct signal. Researchers have applied several reinforcement learning approaches for decentralized collision avoidance. Chen et al. [13] developed an innovative method that applies deep reinforcement learning to offload the online computation to an offline learning procedure to predict interaction patterns. A value network that uses each agent’s joint configuration is developed to estimate the time to the goal. This value network also finds a collision-free velocity vector by admitting the efficient queries while considering other agents’ motion uncertainty. However, some robots become stuck when the obstacle field is dense such that various traps and dead ends are formed. One effective method to resolve the dead-end issue is presented by [44][40]. Meanwhile, Li et al. [37][38] presented a continuous action space-based algorithm. In this method, only the positions and velocities of nearby agents are observed by each agent. A solution of simple convex optimization with safety constraints from ORCA is implemented to resolve the multi-robot collision-avoidance problem. The training process of the proposed approach is much faster than other RL-based collision-avoidance algorithms. Fan et al. [2] designed a decentralized sensor-level collision-avoidance policy. The movement velocity for an agent’s steering commands is directly drawn from raw sensor measurements. The technique used here is policy-gradient-based reinforcement learning. The said technique is integrated into a hybrid control framework to improve the policy’s robustness and effectiveness. Liang et al. [12] used a depth camera with 2D LIDAR as multiple perception sensors to detect nearby dynamic agents and calculate collision-free velocities. The previously unseen virtual and dense real-world environment is directly transferable from the navigation model learned by the agents. However, in the case of glass or nonplanar surfaces, the sensors fail to perform accurately. Bae et al. [43][39] suggested a combination of Deep q-learning and CNN algorithm. This combination enhances the learning algorithm and analyzes the situation more efficiently. Depending on the given situation, the agents can act independently or collaboratively. The memory regeneration technique is used to reuse experience data and reduce correlations between samples to improve data efficiency. The presented method uses image-processing techniques such as object recognition [46,47][43][44] to obtain the robot’s location. However, an unnecessary movement occurs in an environment where the generated path is simple or without obstacles. Lin et al. [44][40] propose a method with a geometric centroid of the robot team, which avoids collisions while maintaining connectivity using Deep RL. The proposed model can sometimes fail to predict the dead-end scenario effectively, which can cause the agent formation to take extra time to reach the goal. Cai et al. [38][41] suggest a combination of Multi-robot Reinforcement Learning MARL and decentralized Control Barrier Function (CBF) shields based on available local information. They developed a Multi-robot Deep Deterministic Policy Gradient (MADDPG) to Multi-robot Deep Deterministic Policy Gradient with decentralized multiple Control Barrier Functions (MADDPGCBF). Each agent has its unique CBFs according to the proposed approach. These CBFs involve cooperative CBFs and non-cooperative CBFs, which deal with the respective types of agents.

3. Collision-Avoidance Techniques to Solve Coordination Issues

To answer RQ3, wresearcher analyzed the primary works presenting collision-avoidance techniques for multi-robot systems while considering the coordination issues. The techniques studied in this workentry are focused on presenting solutions to the said problem for a group of more than two agents. Many studies on collision avoidance in a multi-robot system made a list when first searched. Most of these studies addressed manipulators or warehouse scenarios, which were unsuitable for this workentry. Studies that addressed the type of agents other than ground robots were excluded, which resulted in around 121 studies. These studies were further screened by excluding those that focused on aspects of the multi-robot system other than collision avoidance, such as trajectory optimization or navigation issues [10,49][10][45]. Studies that addressed only the coordination issue without considering collision avoidance were excluded [7]. The final screening resulted in 17 carefully selected studies for relevance to the topic, each presenting a unique collision-avoidance strategy. Collision avoidance is an essential part to be considered when dealing with multi-robot systems. It includes collisions with obstacles and among the agents. Two types of methods most often used are centralized and decentralized approaches. The decentralized approach is computationally inexpensive and enables the agent to be more independent. Another division is classical and reactive approaches [50][46]. However, thires reviewearchers provides information about effective decentralized techniques, classical or reactive. This reviewentry is designed according to the guidelines provided by Okoli [31][47], which resulted in the final selection of 17 papers. All the studies included are focused on finding and developing an effective strategy to avoid collisions in a multi-robot system. Several methods, including deep reinforcement learning, fuzzy logic, and supervised learning, are used as a base to validate or apply the proposed strategies. Most of the studies are focused on application in an open-space environment with static or dynamic obstacles, while confined-space scenarios are not often studied. Environments such as AC ducts and sewers need to be explored even more, as these environments offer different types of hurdles for a multi-robot system compared to open-space environments. Only 2 of the 17 studies addressed the deadlock situation [2,5][2][5], which can appear when agents need to swap their positions or cross a narrow entrance, while others fail to perform in such scenarios [45][42]. Some of the crucial criteria for decentralized multi-robot collision avoidance are summarized as the following: Coordination strategy: Several efficient coordination systems have produced successful collision avoidance for multi-robot systems. The velocity obstacle method allows robots to transmit and receive each other’s states and intentions via an altruistic coordination network. A token-passing technique based on a synchronized shared memory holds all robots’ current trajectories, which learns a value function that completely encodes collaborative behaviors. By utilizing the processed data from the LIDAR sensor, agents coordinate with others via a robot team that works as a centroid or beacon to exchange information among the robots. In the decentralized control barrier function, local information received through the agent’s sensors can be shared by other robots nearby. Traversable region detection region: A unique application of the locally confined multi-robot pathfinding (MAPF) solvers is suggested by Dergachev et al. [30]. This approach presents a way to build a grid-based MAPF instance to avoid deadlocks. Through the learned policies, the robots use local observations of each robot and traversable detection region to collaboratively plan the move to accomplish the team’s navigation task. A proper data-structure-based technique is essential for providing efficient information exchange, as suggested by Nagavarapu et al. [36][35]. Combining 2D multiple perception sensors using LIDAR and depth cameras enabled the agents to sense the dynamic agents in the surroundings and compute collision-free velocities. An approach using Voronoi cells and RVO cones provides an efficient calculation of collision-free direction for each agent. On the other hand, exploiting the velocity information in the proposed technique resulted in less complicated and collision-free trajectories by the traversable region-detection region. Furthermore, incorporating virtual center points implemented in a distributed manner should be set based on the current state of nearby robots and the agent itself. Optimal multi-robot trajectories: While ensuring optimal collision-free navigation, agents are also required to maintain a coordination link within their coordinated network to lower the communication overhead in decentralized systems. One of the essential factors is detecting nearby dynamic agents and calculating collision-free velocities. Early detection and trajectory prediction will result in a higher success rate with less time and trajectory length taken to the goal. Detection of the obstacle’s direction and assuming it to be the leader, the agent can be trained to maintain a predefined distance using formation control. Adaptability: One of the essential criteria for successful application is adaptability, especially in deadlock situations that often occur in multi-robot systems. Few studies faced a problem where agents become stuck when deployed in a dense environment. In some studies, the problem of failing to navigate in an uncertain environment also arose. In some scenarios, only static obstacles are considered while training the agents, making it challenging to address inter-agent collisions. Overall, it is observed that not a single study was able to address all the issues faced when designing a collision-avoidance algorithm. However, deadlock and inter-agent collisions are two main issues that need to be addressed to develop an efficient collision-avoidance model.