Unmanned aerial vehicles (UAVs) are widely used in Internet-of-Things (IoT) networks, especially in remote areas where communication infrastructure is unavailable, owing to their flexibility and low cost. However, jointly optimizing UAV locations and relay path selection can be very challenging, especially when the numbers of IoT devices and UAVs are very large.
1. Introduction
With the trend toward seamless connectivity and support for vertical services, 6G networks will include a large number of Internet-of-Things (IoT) devices deployed in diverse scenarios to carry a wide range of applications, such as data collection and emergency detection ^{[1]}^{[2]}^{[3]}. However, most IoT devices may be deployed in remote areas, such as remote suburban and rural regions or even mountains and deserts. In such regions, IoT devices cannot communicate with each other directly because of the long distances between them, and infrastructure such as base stations (BSs) is usually missing due to high economic costs ^{[4]}^{[5]}^{[6]}. Therefore, it is necessary to deploy flexible and low-cost relays to satisfy the communication demands of IoT devices ^{[7]}. As a promising technology, unmanned aerial vehicles (UAVs) have attracted much attention from wireless communications researchers due to their flexibility and low cost. According to ^{[8]}, research on UAVs in wireless communications can be divided into three main directions: UAV-aided ubiquitous coverage, information dissemination, and relaying. It has been demonstrated that a UAV can be used to extend the coverage of wireless networks, provide services to more users ^{[4]}, and enhance communication performance for remote users in wireless networks ^{[6]}^{[9]}^{[10]}^{[11]}.
Many methods have been used to optimize the performance of UAV networks, e.g., convex optimization ^{[5]}, stochastic geometry ^{[12]}, and learning-based strategies ^{[4]}^{[13]}. However, in sixth-generation (6G) communication networks, especially in IoT networks, there are five main challenges to the application of traditional optimization techniques or learning-based methods.

High performance. In a 6G network, there will be a large number of image and video monitoring tasks for IoT devices, and transmitting such data requires high communication rates ^{[1]}. It is usually necessary to jointly optimize the trajectories, relay paths, transmit powers, and so forth of the UAVs and IoT devices to maximize the communication rates. However, this joint optimization is generally not a convex problem, and it is challenging to find the optimal solution quickly ^{[5]}. In addition, traditional optimization methods such as the alternating minimization (AM) algorithm often become trapped in poor local optima ^{[14]}.

High efficiency. In a 6G IoT network, the number of IoT devices can be very large, and thus the algorithm’s time complexity should be very low to deal with the large-scale optimization problem ^{[15]}. In addition, the ultra-low latency requirements of certain 6G services, such as mobile IoT services, mean that the algorithm’s execution time significantly affects the quality of service (QoS) ^{[16]}. Therefore, the algorithm must execute very efficiently so that the QoS can be improved. Exploiting traditional optimization and heuristic algorithms is thus challenging, since they usually need a long time to generate a solution, especially when the network scale is large.

High robustness. As IoT devices may be moving, the algorithm should be robust to small changes in the locations of the IoT devices, i.e., the optimization results should be directly inferable from the algorithm without iteration-based execution or retraining. Traditional optimization algorithms need to be executed again whenever the environment changes, resulting in extra delays when the environment is not stable. Unfortunately, using traditional neural network (NN) methods is challenging due to their low generalizability ^{[17]}.

High scalability. In 6G networks, there will be many IoT devices that hibernate periodically and switch on when triggered by a timer ^{[18]}. Thus, the scale of the network can change over time. This requires the algorithm to scale with an increasing or decreasing number of users in the network. However, traditional multilayer perceptrons (MLPs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs), and even attention-based transformer networks, have no such scalability ^{[19]}.

Low complexity. In UAV networks, the optimization algorithm usually runs on the UAV’s processor. However, it is difficult for UAVs to carry high-performance computing chips due to limitations such as the UAVs’ weight and energy consumption. Moreover, in 6G IoT networks, the number of users and UAVs might be very large, leading to a possible increase in the algorithm complexity ^{[5]}. Therefore, the algorithm should have low time complexity to improve efficiency and low space complexity so that it can run on the UAV without memory overflow, even when it deals with very large IoT networks.
In recent years, an emerging neural network architecture named the graph neural network (GNN) has gained increasing attention ^{[20]}. GNNs can discover not only features of data but also relationships between them using a graph structure, which significantly improves the power to analyze data ^{[21]}. Therefore, GNNs have been successfully applied in many fields, e.g., community detection ^{[22]}, drug design ^{[23]}, and combinatorial optimization ^{[24]}. Furthermore, the message passing mechanism in GNNs is highly consistent with distributed optimization algorithms ^{[25]}. Thus, GNN-based optimization algorithms have been successful in several areas of wireless communication networks, such as power allocation ^{[19]}, signal detection ^{[26]}, network slicing ^{[27]}, and virtual network function (VNF) design ^{[28]}. Due to the message passing mechanism, GNNs do not need to process data from all users simultaneously but only from each user locally. Therefore, the number of trainable parameters in GNNs is significantly lower than in traditional neural network architectures such as MLPs and CNNs, yielding much lower computational complexity ^{[29]}. This allows GNNs to be easily deployed on UAVs with limited computational and storage capabilities. Moreover, as the message passing of a GNN is structure-independent, the GNN has good scalability and can flexibly cope with network scenarios with different numbers of users and topologies.
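As a minimal sketch of why message passing keeps the parameter count independent of the network size, the following pure-Python layer applies the same two weight matrices at every node. The mean aggregator, the ReLU activation, and all names (`message_passing_layer`, `w_self`, `w_neigh`) are illustrative assumptions and do not reproduce the architecture of any cited work.

```python
def message_passing_layer(features, neighbors, w_self, w_neigh):
    """One mean-aggregation message-passing step (illustrative sketch).

    features:  list of per-node feature vectors (lists of floats)
    neighbors: neighbors[i] is the list of node indices adjacent to node i
    w_self, w_neigh: weight matrices (out_dim x in_dim) shared by ALL nodes
    """
    def matvec(w, x):
        return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

    out = []
    for i, x in enumerate(features):
        nbrs = neighbors[i]
        if nbrs:
            # Aggregate: element-wise mean of the neighbors' features.
            agg = [sum(features[j][k] for j in nbrs) / len(nbrs)
                   for k in range(len(x))]
        else:
            agg = [0.0] * len(x)
        # Update: combine the node's own features with the aggregated message.
        h = [a + b for a, b in zip(matvec(w_self, x), matvec(w_neigh, agg))]
        out.append([max(0.0, v) for v in h])  # ReLU
    return out


def num_parameters(w_self, w_neigh):
    """Trainable parameters of the layer; does not depend on the graph size."""
    return sum(len(row) for row in w_self) + sum(len(row) for row in w_neigh)
```

Because the weights are shared across nodes, the same layer can be applied unchanged to a graph with 3 nodes or 5,000 nodes; only the `features` and `neighbors` inputs grow.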
2. Deployment Optimization for UAV–IoT Networks
In recent years, many works have focused on optimizing the communication quality in IoT networks. Some early studies optimize the location of a single UAV or a small number of UAVs ^{[30]}^{[31]}. The joint location optimization of multiple UAVs has broader prospects in practical applications. Taking timeliness into account, Galkin ^{[32]} uses the traditional and efficient K-means clustering algorithm to plan the location of each UAV and determine the devices each UAV serves, achieving high efficiency at the cost of some performance. Where optimization requirements are higher, most studies rely on convex optimization and evolutionary algorithms.
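The K-means idea can be sketched as follows: device coordinates are clustered, each cluster centroid becomes a UAV location, and the cluster membership gives the devices served by that UAV. This is a generic Lloyd-style K-means in 2D written for illustration; the function name, the random initialization, and the squared-Euclidean distance are assumptions and are not taken from ^{[32]}.

```python
import random


def kmeans_uav_placement(devices, num_uavs, iters=50, seed=0):
    """Cluster 2D device positions; return (uav_locations, assignment)."""
    rng = random.Random(seed)
    # Initialize UAV locations at randomly chosen device positions.
    centers = rng.sample(devices, num_uavs)
    assign = [0] * len(devices)
    for _ in range(iters):
        # Assignment step: each device is served by its nearest UAV.
        new_assign = [
            min(range(num_uavs),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2)
            for x, y in devices
        ]
        # Update step: move each UAV to the centroid of the devices it serves.
        new_centers = []
        for c in range(num_uavs):
            members = [devices[i] for i, a in enumerate(new_assign) if a == c]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[c])  # keep a UAV with no devices in place
        if new_assign == assign and new_centers == centers:
            break  # converged
        assign, centers = new_assign, new_centers
    return centers, assign
```

Each iteration costs time proportional to the number of devices times the number of UAVs, which is consistent with the efficiency argument above; the result is only a local optimum of the clustering objective.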
In the deployment of multiple UAVs, some works obtain locally optimal solutions based on local information. For example, Huang ^{[33]} uses a coverage maximization algorithm that requires only local information to find a locally optimal solution. The algorithm is computationally simple and can be completed in real time. Another work by Huang ^{[34]} uses a distributed solution. An efficient subregion partitioning method is proposed so that each UAV serves almost equal traffic demand, minimizing the maximum traffic demand of the subregions under constraints on the traffic demand and the shape of the subregions. In addition, this work proposes a local search procedure that relocates the UAV using the backtracking line search algorithm. Methods based on local information usually produce solutions quickly, but they generally yield only locally optimal solutions, which can be useful in some specific situations. More works use global information to optimize UAV locations.
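To illustrate what a backtracking line search looks like in a relocation step, the sketch below moves a single UAV by gradient descent with an Armijo backtracking rule. The objective (sum of squared distances from the UAV to its devices), the function name, and the parameter values are assumptions chosen for simplicity; the objective used in ^{[34]} is not specified here.

```python
def relocate_uav(start, devices, beta=0.5, c=1e-4, tol=1e-6, max_iters=200):
    """Gradient descent with Armijo backtracking on a toy relocation cost."""
    def cost(p):
        # Assumed cost: sum of squared distances from the UAV to its devices.
        return sum((p[0] - dx) ** 2 + (p[1] - dy) ** 2 for dx, dy in devices)

    def grad(p):
        return (2 * sum(p[0] - dx for dx, _ in devices),
                2 * sum(p[1] - dy for _, dy in devices))

    p = (float(start[0]), float(start[1]))
    for _ in range(max_iters):
        g = grad(p)
        g2 = g[0] ** 2 + g[1] ** 2
        if g2 < tol ** 2:
            break  # gradient norm below tolerance: stop
        t = 1.0
        # Backtracking: shrink the step until the Armijo sufficient-decrease
        # condition holds (with a floor on t as a numerical safeguard).
        while t > 1e-12 and cost((p[0] - t * g[0], p[1] - t * g[1])) > cost(p) - c * t * g2:
            t *= beta
        p = (p[0] - t * g[0], p[1] - t * g[1])
    return p
```

For this particular quadratic cost the minimizer is the centroid of the devices, which makes the sketch easy to check; a realistic rate-based or traffic-based objective would only change `cost` and `grad`.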
Because many works apply linear programming and other convex optimization methods to the UAV deployment problem, Cicek ^{[35]} conducts a comprehensive survey of the literature on UAV position optimization. In this work, a general optimization framework is constructed through a general mixed-integer nonlinear programming (MINLP) formulation, and its components are specified. Sabzehali ^{[36]} formulates the UAV location optimization problem as an integer linear program and proposes a low-complexity algorithm. A novel work by Kang ^{[37]} proposes Gibbs-sampling-based (GSB) placement learning, which gradually learns suboptimal UAV locations by generating a sequence of samples of the UAV locations that constitute a Markov chain, where the transition probability is determined by the maximum and minimum rates of the different UAV placement configurations. However, this work is difficult to adapt to dynamic IoT networks, and the convergence speed is highly dependent on the initial UAV locations. Methods based on linear programming and convex optimization usually achieve satisfactory solutions, but their time complexity is high, and they are usually difficult to adapt to dynamic IoT networks. Especially for large-scale IoT networks, it is difficult to obtain a good solution in real time.
Since UAV location optimization problems are often modelled as non-convex problems such as mixed-integer optimization problems, for which convex optimization struggles to obtain a good solution in real time, heuristic algorithms, especially genetic algorithms, are also often used to optimize UAV deployment. Košmerl ^{[38]} proposed a method with the genetic algorithm as the core principle for complete coverage with a minimum number of UAVs for Portable Ground Station and Low Altitude Platform. Kalantari ^{[39]} proposed a heuristic algorithm based on particle swarm optimization. In this work, the number of UAVs and their 3D placement are estimated while the coverage and capacity constraints of the system are satisfied. In addition, Plachy ^{[40]} investigated the performance of two joint association and localization methods based on the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). Methods based on evolutionary algorithms can usually achieve satisfactory results, but it is difficult for them to obtain a solution in a short time, especially when IoT communication requirements change rapidly.
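A minimal particle swarm optimization sketch for 2D UAV placement is given below to make the iterative nature of such heuristics concrete. The cost function (sum of squared distances from each device to its nearest UAV), the inertia and acceleration coefficients, and all names are illustrative assumptions; they do not reproduce the formulations in ^{[39]} or ^{[40]}.

```python
import random


def coverage_cost(uavs, devices):
    """Assumed cost: sum over devices of squared distance to the nearest UAV."""
    return sum(min((dx - ux) ** 2 + (dy - uy) ** 2 for ux, uy in uavs)
               for dx, dy in devices)


def pso_uav_placement(devices, num_uavs, bounds, particles=20, iters=100,
                      w=0.7, c1=1.5, c2=1.5, seed=0):
    """Return (best UAV locations, history of the global best cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    dim = 2 * num_uavs  # each particle encodes (x, y) for every UAV

    def decode(vec):
        return [(vec[2 * i], vec[2 * i + 1]) for i in range(num_uavs)]

    def cost(vec):
        return coverage_cost(decode(vec), devices)

    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the deployment area.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return decode(gbest), history
```

Every iteration evaluates the cost for all particles, so the running time grows with the swarm size, the iteration count, and the network size, which illustrates why such methods struggle to respond quickly when communication requirements change.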
With the rapid development of deep learning, more and more scholars use deep learning technology to solve UAV-assisted communication-related problems ^{[41]}. In ^{[42]}, a CNN-based method is used to optimize sensor placement in a sensor network, and in ^{[43]}, a double deep Q-network (DDQN) is used to solve the task scheduling problem via reinforcement learning (RL). An actor-critic-based RL method is proposed to solve the comprehensive offloading problem in satellite-UAV-ground networks ^{[4]}. However, relatively few studies optimize the locations of UAVs directly through a deep neural network model.