
Subjects:
Telecommunications

Multi-input multi-output and non-orthogonal multiple access (MIMO-NOMA) Internet-of-Things (IoT) systems can markedly improve channel capacity and spectrum efficiency to support real-time applications. Age of information (AoI) plays a crucial role in real-time applications, as it measures the timeliness of the extracted information. In MIMO-NOMA IoT systems, the base station (BS) determines the sample collection commands and allocates the transmit power for each IoT device.

- deep reinforcement learning
- age of information
- MIMO-NOMA
- Internet of Things

With the development of the Internet of Things (IoT), the base station (BS) can support real-time applications such as disaster management, information recommendation, vehicular networks, smart cities, connected health and smart manufacturing by collecting the data sampled by IoT devices [1,2]. However, the amount of sampled data is enormous and the number of IoT devices is usually large; thus, realizing these IoT applications requires a large amount of spectrum [3]. The multi-input multi-output and non-orthogonal multiple access (MIMO-NOMA) IoT system transmits data over the MIMO-NOMA channel to address these problems: multiple antennas are deployed at the BS to improve the channel capacity, and multiple IoT devices access the common bandwidth simultaneously to improve the spectrum efficiency.

The BS collects data in discrete time slots in the MIMO-NOMA IoT system. In each slot, the BS first determines the sample collection commands and the transmit power allocation for each IoT device, and then sends the corresponding sample collection command and transmission power to each device. Afterwards, each IoT device decides whether to sample data from the physical world according to its sample collection command, and then uses its allocated power to transmit the sampled data to the BS over the MIMO-NOMA channel. During transmission, multiple IoT devices transmit their signals over the same spectrum, so interference exists between different IoT devices. To eliminate this interference, the BS adopts the successive interference cancellation (SIC) technique to decode the signal from each device [4]. Specifically, the BS sorts the received signal powers in descending order and decodes the signal with the highest received power, treating the other signals as interference. The BS then removes the decoded signal from the received signals and re-sorts the remainder to decode the next signal. This process is repeated until all signals are decoded.
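The SIC decoding order and the resulting per-signal SINR described above can be sketched as follows. This is a minimal illustration: the power values, noise level and helper function are assumptions for the example, not parameters from the source system.

```python
import numpy as np

def sic_decode(received_powers, noise_power=1e-9):
    """Illustrative successive interference cancellation (SIC).

    Decodes signals in descending order of received power; at each
    step, the not-yet-decoded (weaker) signals act as interference.
    Returns the decoding order and the SINR of each signal.
    (Hypothetical helper for illustration only.)
    """
    order = np.argsort(received_powers)[::-1]   # strongest signal first
    remaining = float(np.sum(received_powers))  # total undecoded power
    sinrs = {}
    for idx in order:
        p = received_powers[idx]
        remaining = max(remaining - p, 0.0)     # remove this signal's power
        interference = remaining                # weaker, undecoded signals
        sinrs[idx] = p / (interference + noise_power)
    return order, sinrs

# Example: three devices with different received powers (watts, assumed)
powers = np.array([4e-6, 1e-6, 2e-6])
order, sinrs = sic_decode(powers)
# Device 0 (strongest) is decoded first, with devices 2 and 1 as interference;
# the last decoded signal sees only noise, hence the highest SINR.
```

Note how the decoding position, not just the transmit power, determines each device's SINR: the strongest signal is decoded against all other signals, while the weakest is decoded interference-free. This is the coupling between power allocation and transmission rate discussed below.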

The age of information (AoI) is a metric that measures the freshness of data, defined as the time elapsed from when the data are sampled to when they are received [5]. In the MIMO-NOMA IoT system, the BS needs to receive data, i.e., decode the signals carrying the data, in a timely manner after they are sampled in order to support real-time applications; thus, a low AoI is critical in MIMO-NOMA IoT systems [6]. Furthermore, the IoT devices are energy-limited, so the MIMO-NOMA IoT system should also keep its energy consumption low to prolong the working time of the IoT devices [7]. Hence, AoI and energy consumption are two important performance metrics of the MIMO-NOMA IoT system [8].

The sample collection commands and power allocation both affect the AoI and energy consumption of the system. For the sample collection commands, if the BS selects more IoT devices to sample, the system consumes more energy, because more IoT devices expend energy on sampling. However, if the BS selects fewer IoT devices to sample, the data from the unselected IoT devices become obsolete, which may increase the AoI of the system. Hence, the sample collection commands affect both the AoI and the energy consumption of the MIMO-NOMA IoT system.

For the power allocation, if an IoT device transmits with high power, its signal is decoded early in the SIC process, where many lower-power signals still act as interference, which may lead to a low signal-to-interference-plus-noise ratio (SINR). Conversely, if an IoT device transmits with low power, the SINR may also deteriorate because of the low transmission power itself. A low SINR yields a low transmission rate, which may cause a long transmission delay and thus a high AoI in the MIMO-NOMA IoT system. Hence, the power allocation affects the AoI of the MIMO-NOMA IoT system; moreover, it affects the energy consumption directly.
Thus, the transmission power affects both the AoI and the energy consumption of the MIMO-NOMA IoT system. As discussed above, it is therefore critical to determine the optimal policy, including sample collection commands and power allocation, that minimizes the AoI and energy consumption of the MIMO-NOMA IoT system. To the best of our knowledge, no existing work minimizes the AoI in the MIMO-NOMA IoT system, which motivates this work. In the MIMO-NOMA IoT system, the allocation of transmission powers has a direct impact on the transmission rate through the SIC process. Additionally, the MIMO-NOMA channel is inherently affected by stochastic noise. Model-based algorithms struggle to construct an accurate model of this process, which makes traditional model-based algorithms unsuitable for the problem. Deep reinforcement learning (DRL) is a model-free method that enables an agent to learn how to make sequential decisions in a complex environment to achieve a specific goal. DRL can learn a near-optimal policy from the interaction between actions and the environment (i.e., the dynamic, stochastic MIMO-NOMA IoT system) [9]. There are some existing studies on DRL-based optimization frameworks in similar systems. In [10], Zhao et al. formulated the joint optimization of video frame resolution selection, computation offloading and resource allocation, and proposed a DRL algorithm with a hierarchical reward function that accounts for energy consumption, quality of experience (QoE), delay and analysis accuracy in the IoT system. In [11], Chen et al. considered an edge-enabled IoT system and studied the joint caching and computing service deployment (JCCSP) problem for IoT applications driven by sensing data. An improved method based on the twin-delayed deep deterministic policy gradient (TD3) was proposed, which achieved significantly better convergence performance than the benchmarks.
In general, DRL algorithms are designed for either continuous or discrete action spaces, and handle the two cases separately.
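To make the coupling between the sampling commands, AoI and energy consumption concrete, the per-slot bookkeeping can be sketched as follows. The slot length, energy costs, random sampling policy and the assumption of always-successful decoding are illustrative simplifications, not the model from the source paper.

```python
import random

SAMPLE_ENERGY = 0.1   # energy to sample once (assumed units)
SLOT = 1.0            # slot length (assumed)

def step_aoi(aoi, sampled, delivered):
    """One-slot AoI update for a single device: if freshly sampled
    data are delivered within the slot, the AoI resets to the one-slot
    transmission delay; otherwise it grows by one slot."""
    if sampled and delivered:
        return SLOT
    return aoi + SLOT

def simulate(num_devices=4, num_slots=100, sample_prob=0.5, tx_power=0.2):
    """Toy MIMO-NOMA IoT slot loop under a random sampling policy."""
    aoi = [0.0] * num_devices
    total_energy = 0.0
    for _ in range(num_slots):
        for d in range(num_devices):
            sampled = random.random() < sample_prob  # BS's sampling command
            delivered = sampled                      # assume SIC decoding succeeds
            if sampled:
                # sampling energy plus transmission energy for one slot
                total_energy += SAMPLE_ENERGY + tx_power * SLOT
            aoi[d] = step_aoi(aoi[d], sampled, delivered)
    avg_aoi = sum(aoi) / num_devices
    return avg_aoi, total_energy
```

A DRL agent of the kind discussed above would replace the random sampling policy here, choosing the sample collection commands and transmit powers so as to minimize a weighted sum of the average AoI and the energy consumption.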

In [12], Grybosi et al. proposed the SIC-aided age-independent random access (AIRA-SIC) scheme (i.e., in a slotted-ALOHA fashion) for IoT systems, wherein the receiver performs SIC to resolve collisions among devices. In [13], Wang et al. focused on minimizing the weighted sum of AoI cost and energy consumption in IoT systems by adjusting the sampling policy, and proposed a distributed DRL algorithm based on the local observation of each device. In [14], Elmagid et al. aimed to minimize the AoI at the BS and the energy consumed by the IoT devices in generating status updates; they formulated an optimization problem based on the Markov decision process (MDP) and proved the monotonicity of the value function associated with the MDP. In [15], Li et al. designed a resource block (RB) allocation, modulation-selection and coding-selection scheme for each IoT device based on its channel condition to minimize the long-term AoI of the IoT system. In [16], Hatami et al. employed reinforcement learning to minimize the average AoI for users in an IoT system consisting of users, energy-harvesting sensors and a cache-enabled edge node. In [17], Sun et al. aimed to minimize the weighted sum of the expected average AoI of all IoT devices, the propulsion energy of an unmanned aerial vehicle (UAV) and the transmission energy of the IoT devices by determining the UAV flight speed, UAV placement and channel resource allocation in a UAV-assisted IoT system. In [18], Hu et al. considered an IoT system wherein UAVs take off from a data center, deliver energy to and collect data from sensor nodes, and then fly back to the data center. They minimized the AoI of the collected data using dynamic programming (DP) and ant colony (AC) heuristic algorithms. In [19], Emara et al. developed a spatio-temporal framework to evaluate the peak AoI (PAoI) of the IoT system, and compared the PAoI under time-triggered traffic with that under event-triggered traffic.
In [20], Lyu et al. considered a marine IoT scenario, wherein the AoI is used to capture the impact of packet loss and transmission delay. They investigated the relationship between the AoI and the state estimation error, and minimized the state estimation error by a decomposition method. In [21], Wang et al. investigated the impact of the AoI on the system cost, which consists of the control cost and the communication energy consumption, in an industrial Internet-of-Things (IIoT) system. They proved that the upper bound of the cost depends on the AoI. In [22], Hao et al. maximized the sum energy efficiency of the IoT devices under AoI constraints by optimizing the transmission power and channel allocation in a cognitive-radio-based IoT system.

In [23], Yilmaz et al. proposed a user selection algorithm for the MIMO-NOMA IoT system to improve the sum data rate, and adopted physical-layer network coding (PNC) to improve the spectral efficiency. In [24], Shi et al. considered the downlink of MIMO-NOMA IoT networks and studied the outage probability and goodput of the system under the Kronecker channel model. In [25], Wang et al. formulated a resource allocation problem in the MIMO-NOMA IoT system consisting of an optimal beamforming strategy and power allocation, wherein the beamforming optimization is solved by the zero-forcing method and the power allocation subsequently by convex optimization. In [26], Han et al. proposed a novel millimeter-wave (mmWave) positioning MIMO-NOMA IoT system and introduced the position error bound (PEB) as a novel performance evaluation metric. In [27], Zhang et al. combined massive MIMO and NOMA to study the performance of the IoT system, and derived closed-form expressions for the spectral and energy efficiencies. In [28], Chinnadurai et al. considered a heterogeneous cellular network and formulated a problem to maximize the energy efficiency of the MIMO-NOMA IoT system, wherein the non-convex problem was solved with the branch-reduce-and-bound (BRB) approach. In [29], Gao et al. considered an mmWave massive MIMO-NOMA IoT system, maximized the weighted sum transmission rate by optimizing the power allocation, and solved the problem by convex optimization. In [30], Feng et al. considered a UAV-aided MIMO-NOMA IoT system in which a UAV serves as the BS. They formulated a problem to maximize the sum downlink transmission rate by optimizing the UAV placement, beam pattern and transmission power, and solved it by convex methods. In [31], Ding et al.
designed a novel MIMO-NOMA system serving two different users, wherein the first user must be served under a strict quality-of-service (QoS) requirement while the second user opportunistically accesses the channel in a non-orthogonal manner; in this way, the requirement that the first user's small IoT packets be transmitted in time can be met. In [32], Bulut et al. proposed a water cycle algorithm (WCA)-based energy allocation method for MIMO-NOMA IoT systems. Their simulation results demonstrated that the proposed method outperforms the empirical search algorithm (ESA) and the genetic algorithm (GA). In [33], Ullah et al. proposed a DDPG-based power allocation algorithm to maximize the energy efficiency of MIMO-NOMA next-generation Internet-of-Things (NG-IoT) networks. Their simulation results demonstrated that the proposed method outperforms random and greedy algorithms.

This entry is adapted from the peer-reviewed paper 10.3390/s23249687
