DRL-Based Load-Balancing Routing Scheme for 6G Space–Air–Ground Integrated-Networks: Comparison
Please note this is a comparison between Version 3 by Camila Xu and Version 4 by Camila Xu.

With the rapid development of the space–air–ground integrated network (SAGIN), satellite communication systems, which offer wide coverage and place few demands on the geographical environment, are gradually becoming a key competitive technology for 6G. Low Earth orbit (LEO) satellite networks feature low transmission delay, low propagation loss, and global coverage, and have become a main research focus in contemporary satellite communication.

  • low earth orbit
  • satellite routing algorithm
  • deep reinforcement learning

1. Introduction

In recent years, the space–air–ground integrated network (SAGIN) has developed rapidly, and satellite communications are a key link in the 6G SAGIN. They can make up for the shortcomings of 5G terrestrial networks, improve network coverage, and increase the fault tolerance of the system. They can also be combined with artificial intelligence, big data, the Internet of Things, and other technologies to provide users with diversified services. Satellite communications cover a larger area and are more globally adaptable than traditional terrestrial networks, and they are gradually becoming a main competitive technology for the next generation of communications. Low Earth orbit (LEO) constellations play a particularly important role. Compared with geosynchronous and medium Earth orbit constellations, LEO constellations stand out for their low transmission delay, low propagation loss, and global coverage. These features make LEO attractive for a variety of applications, including internet services, global positioning systems, and remote sensing. Low propagation delay and loss suit time-sensitive applications such as real-time communications, while global coverage suits applications that require connectivity in remote or hard-to-reach locations. LEO has therefore attracted great attention in recent years, leading to new technologies and algorithms that improve its performance and efficiency. As SAGIN develops, traditional terrestrial communication networks alone can no longer meet future demands, and LEO satellite communication has become a promising direction.
In a low-Earth orbit satellite network, inter-satellite links (ISLs) ensure communication between satellites. Compared to terrestrial communication networks, LEO satellite networks change their topology more frequently, have longer inter-satellite link delays, and have more frequently changing link states in multi-user areas. Due to high-speed dynamic changes, the cost of traditional path selection methods increases significantly. Therefore, routing protocols for terrestrial applications are difficult to use directly in LEO satellite networks. LEO satellite routing technology is also a supporting technology for the integration of 6G SAGIN remote sensing, communication and computing. Therefore, it is necessary to study routing algorithms in low-orbit satellite networks.
Most of the existing satellite routing algorithms are developed based on terrestrial network routing algorithms. Most of these algorithms are based on the shortest path. Due to the difference in satellite density between high and low latitudes and the difference in user distribution density [1], the load difference between satellites of the same constellation is large. In addition, with the high-speed movement of satellites, the high-load coverage area between satellites also changes rapidly. Therefore, traditional routing algorithms have encountered difficulties in meeting the current development of satellite networks.
Deep learning offers strong perception, reinforcement learning offers strong decision-making, and deep reinforcement learning combines the two. In deep reinforcement learning, an agent makes decisions by interacting with its environment and learns from trial-and-error feedback to maximize rewards and minimize penalties [2]. Because of this combined perception and decision-making ability, a growing number of researchers apply deep reinforcement learning to computer vision [3,4], speech recognition, automatic driving [4,5], and other fields. Deep reinforcement learning is also applicable to LEO satellite networks: an agent can perceive topology changes, load changes, and network parameters such as latency and bandwidth, and can then make routing decisions according to the needs of network services.

2. Low Earth orbit satellite routing

LEO satellite constellations can be divided into two categories: the Walker Delta constellation, based on inclined orbits, and the Walker Star constellation, based on polar orbits. As shown in Figure 1, the Iridium constellation [6] is a typical polar-orbiting constellation; many scholars have studied it as a low-orbit constellation because of its representative structure and easy-to-construct mathematical model.
Figure 1
Illustration of the Iridium constellation.
The orbits and topology of the Iridium constellation are shown in Figure 2. Because the satellites move and their connectivity changes, the topology changes rapidly. Satellites in counter-rotating orbits cannot establish inter-satellite links with each other. In addition, the communication links change when a satellite passes over the poles. Because of the challenges brought by this dynamic topology, satellite network routing algorithms have attracted a lot of research interest. Work in this field falls mainly into two types: centralized routing algorithms and distributed routing algorithms.
Figure 2
Orbit map of the Iridium satellite.
Distributed routing algorithms can adapt to the dynamic scenarios of satellite networks because each satellite determines the next hop from the state of its neighboring satellites, such as remaining bandwidth and queue utilization. When the state of a neighbor changes, the algorithm can sense it quickly and adjust its routing decision to the dynamic environment. Drawing on the design ideas of existing terrestrial distributed routing algorithms, the authors of [7] study a routing method for satellite networks. By taking full account of the characteristics of LEO satellites, the use of onboard buffer space is improved; packets are classified and a corresponding routing method is designed in [8].
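As a rough illustration of the distributed idea described above, the following Python sketch picks a next hop using only the locally known state of neighboring satellites. The `NeighborState` fields, the weights, and the scoring rule are assumptions made for illustration; they are not taken from the cited works.

```python
from dataclasses import dataclass


@dataclass
class NeighborState:
    """Locally observed state of one neighboring satellite (hypothetical fields)."""
    name: str
    remaining_bandwidth: float  # fraction of ISL capacity still free, 0..1
    queue_utilization: float    # fraction of the output queue in use, 0..1
    hops_to_destination: int    # hop count from this neighbor to the destination


def choose_next_hop(neighbors, w_bw=1.0, w_queue=1.0, w_hops=0.5):
    """Return the neighbor with the best locally computed score.

    The linear score is an assumed heuristic: it rewards free bandwidth
    and penalizes full queues and long remaining paths.
    """
    if not neighbors:
        raise ValueError("no neighbors available")

    def score(n):
        return (w_bw * n.remaining_bandwidth
                - w_queue * n.queue_utilization
                - w_hops * n.hops_to_destination)

    return max(neighbors, key=score)


neighbors = [
    NeighborState("north", remaining_bandwidth=0.2,
                  queue_utilization=0.9, hops_to_destination=2),
    NeighborState("east", remaining_bandwidth=0.8,
                  queue_utilization=0.1, hops_to_destination=3),
]
best = choose_next_hop(neighbors)
```

With these example values the congested but closer neighbor loses to the lightly loaded one; setting `w_bw` and `w_queue` to zero reduces the rule to plain fewest-hops forwarding.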
Unlike distributed routing algorithms, centralized algorithms require global information about the satellite network [9]. A master control node first collects the global information and then calculates the routing paths. Once it has the routing results, it transmits the entire routing policy to the other nodes. In [10], the authors designed an improved distributed hierarchical routing protocol (DHRP) for satellite networks. The protocol sets up master nodes and candidate nodes and is reported to route better than traditional discrete relaxation algorithms (DRAs). The authors of [11] propose a hybrid global-local load balancing routing (HGL) algorithm; however, it is ineffective when large-scale traffic flows change suddenly. In [12], the authors propose a probabilistic ISL routing (PIR) algorithm in which communication delay is used to evaluate path selection performance. The algorithm also takes the cost of inter-satellite links into account.
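The centralized workflow described above (collect global state, compute paths, distribute the result) can be sketched as follows. This is a minimal example that assumes the master node already holds the full topology as a dictionary of link costs and uses Dijkstra's algorithm; the topology, node names, and cost values are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, first_hop) for every node reachable from source."""
    dist = {source: 0.0}
    first_hop = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # The first hop is inherited along the shortest path.
                first_hop[nbr] = nbr if node == source else first_hop[node]
                heapq.heappush(heap, (nd, nbr))
    return dist, first_hop


def build_routing_tables(graph):
    """Master node: compute one next-hop table per satellite."""
    return {node: dijkstra(graph, node)[1] for node in graph}


# Hypothetical global view collected by the master node (link cost = delay in ms).
topology = {
    "S1": {"S2": 10.0, "S3": 25.0},
    "S2": {"S1": 10.0, "S3": 10.0, "S4": 30.0},
    "S3": {"S1": 25.0, "S2": 10.0, "S4": 10.0},
    "S4": {"S2": 30.0, "S3": 10.0},
}
tables = build_routing_tables(topology)
```

Each entry `tables[node]` is the table that the master node would send to that satellite. Because every table depends on the full topology, any link change forces a new collection and distribution round.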
Although the above algorithms have made great progress in adapting to the dynamics of LEO satellites, their failure to account for satellite load remains a major flaw.

3. Load balancing of low-Earth orbit satellites

The length of an inter-satellite link between LEO satellites varies with the latitude of the satellites. Traditional shortest-path routing designs rely only on path length, which causes traffic to aggregate at higher latitudes [1,13]. Figure 3 shows a 3D schematic of the satellite traffic distribution created with the NS3 network simulator. Black dots represent LEO satellites, and line segments represent inter-satellite links that carry traffic. The thickness and color of each segment correspond to its traffic and bandwidth utilization, respectively: thicker lines indicate higher traffic, and darker colors indicate higher bandwidth utilization. Notably, the LEO satellite network experiences congestion mainly in high-latitude and densely populated areas. In addition, the uneven distribution of ground gateway stations creates an imbalance in the load of the satellite network. User mobility and global population distribution are also key factors affecting the distribution of traffic flow [14]. The high-speed movement of satellites leads to rapid changes in the areas covered by heavily loaded satellites [15,16].
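To make the latitude dependence concrete, the sketch below computes the length of an inter-plane link under an idealized geometry: two satellites in adjacent polar planes at the same latitude, so that the link is a chord of the latitude circle. The formula, the Earth radius, and the Iridium-like default altitude and plane spacing are illustrative assumptions, not values taken from the source.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def inter_plane_isl_length_km(latitude_deg, altitude_km=780.0,
                              plane_spacing_deg=31.6):
    """Chord length between two satellites in adjacent polar planes.

    Assumes both satellites sit at the same latitude, so their
    separation is 2 * r * sin(spacing / 2) * cos(latitude).
    """
    orbit_radius = EARTH_RADIUS_KM + altitude_km
    chord_at_equator = 2.0 * orbit_radius * math.sin(
        math.radians(plane_spacing_deg) / 2.0)
    return chord_at_equator * math.cos(math.radians(latitude_deg))


lengths = {lat: inter_plane_isl_length_km(lat) for lat in (0, 30, 60)}
```

Under this model the inter-plane link at 60 degrees latitude is half as long as at the equator, which is why a routing metric based purely on path length tends to steer traffic toward high latitudes.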
Figure 3
Traffic distribution graph.
A path-based load-balanced satellite routing algorithm is proposed in [14,17] to minimize the maximum network traffic. The algorithm avoids traffic aggregation at high latitudes by treating all inter-satellite links as having the same path length and giving all paths the same priority. The authors of [18] divide the transmission area into a heavy-load range and a light-load range, taking into account the relationship between the reverse seam and the gateway station. The heavy-load range uses a congestion indicator to handle uneven traffic distribution with the least-weighted path. However, this approach requires link-state information for the entire network and cannot make decisions in real time [19]. The explicit load balancing (ELB) algorithm is proposed to exchange congestion information between satellite nodes [20,21]. ELB uses queue occupancy to determine whether a satellite node is idle or busy. When a node is marked as busy, it sends messages asking its neighbors to reduce their transmission rates toward it. In this way, ELB achieves its load balancing goal and avoids traffic congestion. The TLR algorithm proposed in [22] considers both the current congestion state and the possible congestion state of the next hop. The authors of [23] propose an iterative Dijkstra mechanism to select the best transmission path for load-balanced routing. The authors of [24] consider link latency to further improve routing performance. In [25], cooperative game theory is used to balance the trade-off between load and transmission delay in LEO routing. In [26], fuzzy logic is used to meet the needs of different users. In [27], transmission cost and route convergence are evaluated as key performance indicators, and an on-demand dynamic routing algorithm combined with orbit prediction is proposed. Energy consumption is also taken into account to improve the quality of service for users [28].
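The queue-based congestion signalling described above for ELB can be sketched roughly as follows. This is a simplified two-state illustration, not the published protocol: the threshold, the rate-reduction factor, and the function names are hypothetical.

```python
def classify_node(queue_length, queue_capacity, busy_threshold=0.8):
    """Label a satellite 'busy' or 'idle' from its queue occupancy.

    busy_threshold is an assumed value, not one taken from the ELB papers.
    """
    if queue_capacity <= 0:
        raise ValueError("queue_capacity must be positive")
    occupancy = queue_length / queue_capacity
    return "busy" if occupancy >= busy_threshold else "idle"


def notify_neighbors(state, neighbor_rates, reduction_factor=0.5):
    """Return the rates neighbors would send at after the notification.

    A busy node asks every neighbor to scale down the traffic it
    forwards to that node; an idle node leaves the rates unchanged.
    """
    if state != "busy":
        return dict(neighbor_rates)
    return {name: rate * reduction_factor
            for name, rate in neighbor_rates.items()}


state = classify_node(queue_length=90, queue_capacity=100)
new_rates = notify_neighbors(state, {"S2": 10.0, "S3": 4.0})
```

In a real network the neighbors would reroute the traffic they hold back onto alternative paths; that step is omitted here.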
The existing literature highlights the advantages of LEO routing in terms of load performance [29,30]. However, challenges related to insufficient local optimization and weak dynamic adaptability remain unresolved, which could hinder the development of LEO satellite networks.

4. Machine learning-based satellite routing

Complex satellite network environments and dynamic inter-satellite links make satellite routes difficult to compute. Reinforcement learning is widely used in a variety of emerging industries because of its ability to deal with sequential decision problems [31]. Content caching was investigated in [32], where Q-learning is applied to a cloud content delivery system. In [33], Q-learning is used to identify congested links in the Internet of Things (IoT) to improve fault tolerance. Similarly, Q-learning is used in [34] to increase the throughput of wireless sensor networks (WSNs) and to address device energy consumption. In [35], deep reinforcement learning is used to solve for the optimal allocation of cache, bandwidth, and other resources in the Internet of Vehicles [36]. Among machine learning approaches, deep deterministic policy gradient (DDPG) [37] has been used to design a centralized satellite routing algorithm for space-ground integrated networks [38,39]. The decision-making center of this strategy is located on the ground; it obtains the traffic information of the entire network in real time and sends routing information to the relevant satellites after making a decision. The disadvantage is that satellite communication delay is very large, so routing decisions cannot be made in time, and the network burden increases; the strategy is therefore not suitable for large-scale use [40]. A routing algorithm based on multi-agent deep deterministic policy gradient (MADDPG) [41] has been proposed in which the routing strategy is deployed on each satellite after centralized training, which solves some of the problems of the above centralized routing algorithm.
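To show how reinforcement learning can be applied to next-hop selection, here is a minimal tabular Q-learning sketch. It is far simpler than the DDPG and MADDPG methods discussed above and is not any cited author's algorithm. The four-satellite topology, the link delays and utilizations, the reward (negative delay plus a load penalty), and all hyperparameters are hypothetical.

```python
import random

# Hypothetical links: (node, node) -> (delay in ms, utilization in 0..1).
LINKS = {
    ("S1", "S2"): (10.0, 0.1),
    ("S2", "S4"): (10.0, 0.9),  # short but congested
    ("S1", "S3"): (15.0, 0.1),
    ("S3", "S4"): (15.0, 0.1),
}


def neighbors(node):
    """List the satellites that share a link with node."""
    return [b if a == node else a for (a, b) in LINKS if node in (a, b)]


def reward(u, v, load_penalty=50.0):
    """Negative cost of forwarding over link u-v (assumed reward shape)."""
    delay, util = LINKS.get((u, v)) or LINKS[(v, u)]
    return -(delay + load_penalty * util)


def train(destination, episodes=2000, alpha=0.5, gamma=0.9,
          epsilon=0.3, seed=0):
    """Learn Q[state][action] with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    nodes = sorted({n for link in LINKS for n in link})
    q = {n: {m: 0.0 for m in neighbors(n)} for n in nodes}
    starts = [n for n in nodes if n != destination]
    for _ in range(episodes):
        state = rng.choice(starts)
        for _ in range(50):  # cap on hops per episode
            if rng.random() < epsilon:
                action = rng.choice(list(q[state]))
            else:
                action = max(q[state], key=q[state].get)
            # No future value once the packet reaches the destination.
            future = 0.0 if action == destination else max(q[action].values())
            target = reward(state, action) + gamma * future
            q[state][action] += alpha * (target - q[state][action])
            state = action
            if state == destination:
                break
    return q


def greedy_path(q, source, destination, max_hops=10):
    """Follow the highest-valued next hop from source to destination."""
    path = [source]
    while path[-1] != destination and len(path) <= max_hops:
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path


q_table = train("S4")
path = greedy_path(q_table, "S1", "S4")
```

Under these assumed costs, the learned policy is expected to prefer the lightly loaded detour through S3 over the shorter but congested route through S2. A deep reinforcement learning method replaces the table with a neural network so that larger and continuously changing state spaces can be handled.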
Centralized training methods have limitations in obtaining sufficient data, which becomes increasingly difficult as the scale of the network and the training complexity of deep reinforcement learning algorithms increase.

References

  1. Mohorcic, M.; Svigelj, A.; Kandus, G.; Werner, M. Performance evaluation of adaptive routing algorithms in packet-switched intersatellite link networks. Int. J. Satell. Commun. Netw. 2002, 20, 97–120.
  2. Liu, L.; Feng, J.; Pei, Q.; Chen, C.; Ming, Y.; Shang, B.; Dong, M. Blockchain-Enabled Secure Data Sharing Scheme in Mobile-Edge Computing: An Asynchronous Advantage Actor–Critic Learning Approach. IEEE Internet Things J. 2021, 8, 2342–2353.
  3. Zhang, Y.; Chen, C.; Liu, L.; Lan, D.; Jiang, H.; Wan, S. Aerial Edge Computing on Orbit: A Task Offloading and Allocation Scheme. IEEE Trans. Netw. Sci. Eng. 2023, 10, 275–285.
  4. Chen, C.; Wang, C.; Liu, B.; He, C.; Cong, L.; Wan, S. Edge Intelligence Empowered Vehicle Detection and Image Segmentation for Autonomous Vehicles. IEEE Trans. Intell. Transp. Syst. 2023, 1–12.
  5. Chen, C.; Yao, G.; Liu, L.; Pei, Q.; Song, H.; Dustdar, S. A Cooperative Vehicle-Infrastructure System for Road Hazards Detection With Edge Intelligence. IEEE Trans. Intell. Transp. Syst. 2023, 24, 5186–5198.
  6. Pizzicaroli, J.C. Launching and Building the IRIDIUM® Constellation. In Proceedings of the Mission Design & Implementation of Satellite Constellations; van der Ha, J.C., Ed.; Springer: Dordrecht, The Netherlands, 1998; pp. 113–121.
  7. Henderson, T.; Katz, R. On distributed, geographic-based packet routing for LEO satellite networks. In Proceedings of the Globecom’00-IEEE. Global Telecommunications Conference. Conference Record (Cat. No.00CH37137), San Francisco, CA, USA, 27 November–1 December 2000; Volume 2, pp. 1119–1123.
  8. Svigelj, A.; Mohorcic, M.; Kandus, G.; Kos, A.; Pustisek, M.; Bester, J. Routing in ISL networks considering empirical IP traffic. IEEE J. Sel. Areas Commun. 2004, 22, 261–272.
  9. Liu, L.; Zhao, M.; Yu, M.; Jan, M.A.; Lan, D.; Taherkordi, A. Mobility-Aware Multi-Hop Task Offloading for Autonomous Driving in Vehicular Edge Computing and Networks. IEEE Trans. Intell. Transp. Syst. 2022, 24, 2169–2182.
  10. Gounder, V.; Prakash, R.; Abu-Amara, H. Routing in LEO-based satellite networks. In Proceedings of the 1999 IEEE Emerging Technologies Symposium. Wireless Communications and Systems (IEEE Cat. No.99EX297), Richardson, TX, USA, 12–13 April 1999; pp. 22.1–22.6.
  11. Long, H.; Shen, Y.; Guo, M.; Tang, F. LABERIO: Dynamic load-balanced Routing in OpenFlow-enabled Networks. In Proceedings of the 2013 IEEE 27th International Conference on Advanced Information Networking and Applications (AINA), Barcelona, Spain, 25–28 March 2013; pp. 290–297.
  12. Franck, L.; Maral, G. Routing in networks of intersatellite links. IEEE Trans. Aerosp. Electron. Syst. 2002, 38, 902–917.
  13. Mohorcic, M.; Werner, M.; Svigelj, A.; Kandus, G. Adaptive routing for packet-oriented intersatellite link networks: Performance in various traffic scenarios. IEEE Trans. Wirel. Commun. 2002, 1, 808–818.
  14. Kucukates, R.; Ersoy, C. Minimum flow maximum residual routing in LEO satellite networks using routing set. Wirel. Netw. 2008, 14, 501–517.
  15. Cigliano, A.; Zampognaro, F. A Machine Learning approach for routing in satellite Mega-Constellations. In Proceedings of the 2020 International Symposium on Advanced Electrical and Communication Technologies (ISAECT), Virtual, 25–27 November 2020; pp. 1–6.
  16. Wang, H.; Ran, Y.; Zhao, L.; Wang, J.; Luo, J.; Zhang, T. GRouting: Dynamic Routing for LEO Satellite Networks with Graph-based Deep Reinforcement Learning. In Proceedings of the 2021 4th International Conference on Hot Information-Centric Networking (HotICN), Nanjing, China, 25–27 November 2021; pp. 123–128.
  17. Kucukates, R.; Ersoy, C. High performance routing in a LEO satellite network. In Proceedings of the Eighth IEEE Symposium on Computers and Communications, Kemer-Antalya, Turkey, 30 June–3 July 2003; Volume 2, pp. 1403–1408.
  18. Liu, W.; Tao, Y.; Liu, L. Load-Balancing Routing Algorithm Based on Segment Routing for Traffic Return in LEO Satellite Networks. IEEE Access 2019, 7, 112044–112053.
  19. Ju, Y.; Zou, G.; Bai, H.; Liu, L.; Pei, Q.; Wu, C.; Otaibi, S.A. Random Beam Switching: A Physical Layer Key Generation Approach to Safeguard mmWave Electronic Devices. IEEE Trans. Consum. Electron. 2023, 1.
  20. Taleb, T.; Mashimo, D.; Jamalipour, A.; Hashimoto, K.; Nemoto, Y.; Kato, N. SAT04-3: ELB: An Explicit Load Balancing Routing Protocol for Multi-Hop NGEO Satellite Constellations. In Proceedings of the IEEE Globecom 2006, San Francisco, CA, USA, 27 November–1 December 2006; pp. 1–5.
  21. Taleb, T.; Mashimo, D.; Jamalipour, A.; Kato, N.; Nemoto, Y. Explicit Load Balancing Technique for NGEO Satellite IP Networks With On-Board Processing Capabilities. IEEE/ACM Trans. Netw. 2009, 17, 281–293.
  22. Song, G.; Chao, M.; Yang, B.; Zheng, Y. TLR: A Traffic-Light-Based Intelligent Routing Strategy for NGEO Satellite IP Networks. IEEE Trans. Wirel. Commun. 2014, 13, 3380–3393.
  23. Liu, J.; Luo, R.; Huang, T.; Meng, C. A Load Balancing Routing Strategy for LEO Satellite Network. IEEE Access 2020, 8, 155136–155144.
  24. Geng, S.; Liu, S.; Fang, Z.; Gao, S. An optimal delay routing algorithm considering delay variation in the LEO satellite communication network. Comput. Netw. 2020, 173, 107166.
  25. Wei, S.; Cheng, H.; Liu, M.; Ren, M. Optimal Strategy Routing in LEO Satellite Network Based on Cooperative Game Theory. In Proceedings of the Space Information Networks: Second International Conference, SINC 2017, Yinchuan, China, 10–11 August 2017.
  26. Jiang, Z.; Liu, C.; He, S.; Li, C.; Lu, Q. A QoS routing strategy using fuzzy logic for NGEO satellite IP networks. Wirel. Netw. 2018, 24, 295–307.
  27. Pan, T.; Huang, T.; Li, X.; Chen, Y.; Xue, W.; Liu, Y. OPSPF: Orbit Prediction Shortest Path First Routing for Resilient LEO Satellite Networks. In Proceedings of the ICC 2019–2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019; pp. 1–6.
  28. Hao, L.; Ren, P.; Du, Q. Satellite QoS Routing Algorithm Based on Energy Aware and Load Balancing. In Proceedings of the 2020 International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China, 21–23 October 2020; pp. 685–690.
  29. Zuo, P.; Wang, C.; Wei, Z.; Li, Z.; Zhao, H.; Jiang, H. Deep Reinforcement Learning Based Load Balancing Routing for LEO Satellite Network. In Proceedings of the 2022 IEEE 95th Vehicular Technology Conference: (VTC2022-Spring), Helsinki, Finland, 19–22 June 2022; pp. 1–6.
  30. Xu, Q.; Zhang, Y.; Wu, K.; Wang, J.; Lu, K. Evaluating and Boosting Reinforcement Learning for Intra-Domain Routing. In Proceedings of the 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), Monterey, CA, USA, 4–7 November 2019; pp. 265–273.
  31. Ju, Y.; Chen, Y.; Cao, Z.; Liu, L.; Pei, Q.; Xiao, M.; Ota, K.; Dong, M.; Leung, V.C.M. Joint Secure Offloading and Resource Allocation for Vehicular Edge Computing Network: A Multi-Agent Deep Reinforcement Learning Approach. IEEE Trans. Intell. Transp. Syst. 2023, 24, 5555–5569.
  32. Liu, Y.; Lu, D.; Zhang, G.; Tian, J.; Xu, W. Q-Learning Based Content Placement Method for Dynamic Cloud Content Delivery Networks. IEEE Access 2019, 7, 66384–66394.
  33. Pan, S.; Li, P.; Zeng, D.; Guo, S.; Hu, G. A Q-Learning Based Framework for Congested Link Identification. IEEE Internet Things J. 2019, 6, 9668–9678.
  34. Wei, Z.; Liu, F.; Zhang, Y.; Xu, J.; Ji, J.; Lyu, Z. A Q-learning algorithm for task scheduling based on improved SVM in wireless sensor networks. Comput. Netw. 2019, 161, 138–149.
  35. Qiao, G.; Leng, S.; Maharjan, S.; Zhang, Y.; Ansari, N. Deep Reinforcement Learning for Cooperative Content Caching in Vehicular Edge Computing and Networks. IEEE Internet Things J. 2020, 7, 247–257.
  36. Tu, Z.; Zhou, H.; Li, K.; Li, G.; Shen, Q. A Routing Optimization Method for Software-Defined SGIN Based on Deep Reinforcement Learning. In Proceedings of the 2019 IEEE Globecom Workshops (GC Wkshps), Big Island, HI, USA, 9–13 December 2019; pp. 1–6.
  37. Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.M.O.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
  38. Ju, Y.; Yang, M.; Chakraborty, C.; Liu, L.; Pei, Q.; Xiao, M.; Yu, K. Reliability-Security Tradeoff Analysis in mmWave Ad Hoc Based CPS. ACM Trans. Sens. Netw. 2023.
  39. Ju, Y.; Wang, H.; Chen, Y.; Zheng, T.X.; Pei, Q.; Yuan, J.; Al-Dhahir, N. Deep Reinforcement Learning Based Joint Beam Allocation and Relay Selection in mmWave Vehicular Networks. IEEE Trans. Commun. 2023, 71, 1997–2012.
  40. Lowe, R.; Wu, Y.; Tamar, A.; Harb, J.; Abbeel, P.; Mordatch, I. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. In Proceedings of the 31st International Conference on Neural Information Processing Systems NIPS’17, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 6382–6393.
  41. Qin, Z.; Yao, H.; Mai, T. Traffic Optimization in Satellites Communications: A Multi-agent Reinforcement Learning Approach. In Proceedings of the 2020 International Wireless Communications and Mobile Computing (IWCMC), Limassol, Cyprus, 15–19 June 2020; pp. 269–273.