Deep-Learning-Based Cooperative Spectrum Sensing

With the rapid development of wireless communication and 5G networks, the growth in mobile users has been accompanied by an increasing demand for the electromagnetic spectrum. The advent of cognitive radio and its spectrum-sensing technology offers hope for solving the problem of low utilization of the wireless spectrum.

cognitive radio
spectrum sensing
wireless communication

1. Introduction

With the continued increase in demand from global mobile users for information exchange, wireless communication is developing rapidly: large-scale commercial deployment of fifth-generation (5G) mobile communication is in full swing, and research and development of sixth-generation (6G) mobile communication technology has begun. The birth of 5G not only offers human-centered communication, which provides users with a significantly better communication experience, but also gives rise to many new internet industries such as cloud gaming and virtual reality [1], with services that range from telephony, short message service (SMS), images, videos, and audio to Internet of Things (IoT) data.

Nevertheless, along with the rapid development of technology and services and their large-scale popularization, the number of cell phone users has also increased dramatically. The available electromagnetic spectrum, a crucial and limited resource for wireless communication, is becoming scarce, and the expansion of wireless communication services means growing demand for electromagnetic spectrum bandwidth [2]. The spectrum resources planned by the ITU (International Telecommunication Union) span 9 kHz to 275 GHz, and the ITU’s 2006 forecast report on spectrum resources expected global spectrum demand to be between 1280 and 1720 MHz in 2020. However, with the development of mobile communication services and the increase in user data traffic, the originally predicted spectrum can no longer meet future demand; it now meets only about half of the demand [3]. In summary, the shortage of electromagnetic spectrum resources is a key problem of wireless mobile communication today and, if not solved effectively, will restrict the healthy development of wireless communication and become a bottleneck for future wireless communication technology.

At present, spectrum resources are managed and allocated by governments worldwide; in the United States, spectrum management and allocation fall under the Federal Communications Commission, and in China, the National Radio Administration manages the spectrum. The allocation method is generally “static allocation”: the available spectrum is divided into multiple non-overlapping parts allocated to different users. The users of these frequency bands are called authorized users, and the bands are called authorized frequency bands. Although countries have allocated the spectrum rationally, after decades of development the utilization rate of each frequency band at different times and in different geographical locations is disappointing. A study by the Federal Communications Commission of the United States found that, among the licensed frequency bands available and in use, spectrum utilization is very low, especially in the band from hundreds of megahertz to 3 GHz, where demand for frequencies is very tight; the utilization rate of the licensed frequency bands fluctuates between 15 and 85 percent. Only a few frequency bands have high utilization rates, and the vast majority have low utilization rates. The utilization rate in some regions is only 5%, so the utilization of the spectrum is extremely unbalanced. At present, the global spectrum resources below 1 GHz have been almost entirely allocated, and the remaining frequency bands available for wireless communication are also quite limited and cannot satisfy the expansion of new services. How to realize dynamic spectrum management and spectrum sharing mechanisms that improve the efficiency of spectrum utilization in each period and each region has become an urgent problem [4].

Cognitive radio (CR) is a technology that can improve spectrum utilization, and efficient and accurate spectrum detection is the key to its implementation. The concept of cognitive radio was proposed by Dr. Joseph Mitola, the “father of software radio”, in his doctoral dissertation in 1999 [5]. He pointed out that the key to cognitive radio lies in the system’s ability to comprehensively recognize, analyze, learn, and judge all kinds of information in the external radio environment. The system can communicate intelligently with other cognitive devices to realize a dynamic spectrum allocation policy to improve the efficiency of the frequency spectrum and achieve reliable communication. The cognitive radio system consists of four main steps, i.e., spectrum sensing, spectrum analysis, spectrum judgment, and spectrum reconstruction. The working mode of the cognitive radio system is shown in Figure 1 below.

Figure 1. The working mode of cognitive radio system.

Among the four steps, spectrum sensing (SS) is one of the essential components of cognitive radio. It is the core technology and prerequisite for realizing CR applications and constructing cognitive radio networks [6], and it is also the focus of this paper. Spectrum-sensing technology enhances spectrum use efficiency by continuously monitoring and analyzing the radio spectrum. It promptly identifies and utilizes unoccupied frequency bands when pre-allocated or statically assigned frequencies are unavailable, meeting the demand for spectrum resources. Radio spectrum sensing has a wide range of promising applications and has been the subject of significant research efforts. In the future, it is foreseeable that this technology will be widely used in communications, radar, unmanned aerial vehicles, intelligent transportation, and other fields. In the field of communication, spectrum sensing can be used for spectrum management and resource allocation in communication systems. SS can monitor the utilization of the radio spectrum to detect which primary user bands are free, enabling better spectrum resource allocation and dynamic spectrum sharing. In the field of radar, spectrum-sensing techniques can monitor and analyze the use of the radar spectrum to achieve optimal utilization and management of the spectrum for radar defense against jamming [7]. In the field of unmanned aerial vehicles (UAVs), spectrum sensing can provide spectrum information for UAV communication and navigation. By sensing the surrounding spectral environment in real time, UAVs can select the optimal communication bands and avoid interfering sources. In addition, spectrum sensing can help UAVs perform environment sensing and obstacle avoidance, improving safety and intelligence [8]. In the field of smart transportation, this technology can monitor the spectrum used by various wireless devices and communication systems in transportation networks and perform dynamic spectrum management to ensure communication quality and reliability for various applications [9]. As can be seen, the development of spectrum-sensing technology is of great strategic importance.

After decades of development, cognitive radio technology and its spectrum-sensing methods can be broadly categorized into two types: single-node spectrum sensing and cooperative spectrum sensing [10]. Single-node spectrum sensing involves independent judgment by a single user, which does not require a complex system structure or data fusion. It provides a relatively simple sensing process but also faces challenges in breaking through the limitations imposed by physical constraints. On the other hand, cooperative spectrum sensing treats multiple sub-users as a group of cooperative entities that share sensing information. This approach aims to achieve more accurate frequency usage detection for primary users (PUs). However, it still encounters several practical challenges, including coordination, energy consumption, latency, security, and participation issues.

In recent years, artificial intelligence has emerged as a popular technology worldwide, and the development of deep learning has introduced new ideas and methods for spectrum-sensing research. Deep learning excels in nonlinear modeling and adaptive learning, significantly boosting the detection performance and speed of spectrum sensing. This, in turn, facilitates more efficient utilization of spectrum resources [11]. Over the past few years, researchers in the field of communications have employed several prominent network models for spectrum sensing, including convolutional neural networks and long short-term memory networks [12,13]. Several published papers have demonstrated the superiority of deep-learning-based spectrum-sensing algorithms over traditional approaches. Consequently, numerous researchers have investigated the combination of deep neural networks with spectrum-sensing research. For instance, in [14], Kaixuan Du et al. considered the geodetic distances between signals as statistical features and employed deep neural networks (DNNs) to classify the data based on these distances, achieving spectrum sensing. Additionally, Jianxin Gai et al. proposed a spectrum-sensing method based on residual networks (ResNets) in [15]. This method employed two-branch convolution to improve feature extraction, which led to higher detection probabilities and lower bit error rates (BERs) than traditional spectrum-sensing methods, especially at low SNRs.

Currently, several institutions, including the University of Electronic Science and Technology, Harbin Institute of Technology, Beijing University of Posts and Telecommunications, Florida State University, and the University of Surrey, have been actively researching deep learning in the field of spectrum sensing for cognitive radio and have made significant progress. This paper aims to provide a summary of the current research status, application scenarios, and future development directions of deep-learning-based spectrum-sensing technology.

However, it is crucial to note that although the spectrum-sensing approach based on classical deep learning (DL) networks has shown promising performance, it also has certain limitations. Most of the current research focuses on the combination of deep learning and non-cooperative spectrum sensing. There are drawbacks to this approach, however. First, in non-cooperative spectrum sensing, the availability of large-scale labeled data can be limited as each device typically only has its own sensing results, leading to a lack of data sharing and collaboration. Second, in non-cooperative spectrum sensing, individual devices sense and process independently, which may not meet the computational requirements of deep learning models due to the computational power and energy constraints of the devices.

There has been a growing interest among scholars to explore the application of deep learning to cooperative sensing. Cooperative sensing allows the aggregation of data from multiple devices, resulting in a larger and more diverse dataset that can be used to train deep learning models. This approach offers several advantages. First, cooperative sensing provides real-time spectrum information and environmental change data, enabling deep learning models to adapt and update rapidly to dynamic conditions. Second, it enhances the availability of observational data and multi-source information, which improves the ability of deep learning models to accurately sense and identify interfering sources.

Researchers have started to investigate deep-learning-based cooperative sensing algorithms. For instance, Tan et al. [16] constructed a new 2D dataset of received signals and trained three cooperative spectrum sensing (CSS) schemes using classical convolutional neural networks (CNNs) such as LeNet, AlexNet, and VGG-16. They compared the performance of CSS schemes based on AND, OR, and majority judgment, with the former achieving favorable results. On the other hand, Chen et al. [17] proposed a DNN-based CSS algorithm and introduced the federated learning framework (FLF) into CSS. Their results demonstrated a detection probability of 98.78% and a false detection probability of 1% at an SNR of −15 dB.
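The AND, OR, and majority judgments mentioned above are all special cases of a K-out-of-N hard-decision rule applied at the fusion center. The following is a minimal sketch of that rule, not the implementation used in [16]; the function name `fuse_hard_decisions` and the example decision vector are hypothetical and chosen only for illustration.

```python
def fuse_hard_decisions(local_decisions, rule="majority"):
    """Fuse binary local decisions (1 = PU judged present) with a K-out-of-N rule.

    Hypothetical helper for illustration; the threshold K depends on the rule:
    AND needs all N votes, OR needs one, majority needs more than half.
    """
    n = len(local_decisions)
    votes = sum(1 for d in local_decisions if d == 1)
    if rule == "and":
        k = n
    elif rule == "or":
        k = 1
    elif rule == "majority":
        k = n // 2 + 1
    else:
        raise ValueError(f"unknown fusion rule: {rule}")
    # Global decision: PU declared present when at least K users voted "present".
    return 1 if votes >= k else 0


# Example: five SUs, two of which report that the PU is present.
local = [1, 0, 1, 0, 0]
results = {rule: fuse_hard_decisions(local, rule) for rule in ("and", "or", "majority")}
print(results)
```

With two of five users voting "present", only the OR rule declares the band occupied, which illustrates why the OR rule tends to raise both detection and false alarm rates while the AND rule lowers both.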

2. Deep-Learning-Based Cooperative Spectrum Sensing

All the previously introduced approaches are designed for the study and application of non-cooperative spectrum sensing, whereas real-world application scenarios generally involve multiple PUs or multiple SUs communicating with each other, which introduces considerable complexity and potential for conflict. Several modes and applications of cooperative spectrum sensing now exist, as introduced earlier. The findings indicate that the hidden node problem has emerged as a prominent problem in cooperative spectrum sensing. This also holds in real-world communication environments, where shadowing and multipath effects may affect the accuracy of sensing results. How to allocate resources to each sensing node efficiently and rationally is also an urgent issue. Recently, scholars have started to introduce deep learning and deep reinforcement learning into cooperative spectrum sensing to alleviate or solve these problems. For example, deep learning can optimize data fusion by automatically extracting features from the data of multiple sensing nodes, enabling the system to make more accurate judgments. In addition, with proper training, deep learning models can better handle and predict uncertainty in dynamic environments. Deep reinforcement learning, on the other hand, can recognize potential security threats or learn to make effective and robust decisions in CSS security scenarios through continual learning and adaptation.

2.1. Applications of Deep Neural Networks in Cooperative Spectrum Sensing

A review of the pertinent literature from recent years reveals that ResNet, graph neural networks, and deep neural networks are the most frequently used models in cooperative spectrum sensing. This section summarizes the research findings on cooperative spectrum sensing from recent years. In 2019, Woongsup Lee et al. considered the case where multiple SUs share sensing information to judge whether a single PU occupies more than one frequency band. They proposed a CNN-based CSS model in [18] that improves sensing performance and stability by considering the state and spectral correlation of the channel and extracting spatial features of the data as input to an optimized, modest-sized CNN. The simulation results show that the method outperforms traditional CSS methods such as K-out-of-N and SVM. The paper is also the first to apply deep neural networks to CSS. In Ref. [19], Zhibo Chen et al. applied a CNN to distributed cooperative spectrum sensing, using the sample covariance matrix as a test statistic and as the input to the CNN, i.e., collecting sensing information from different SUs for learning. The proposed model, CSS-CNN, is evaluated under channel fading against other CNN and CLDNN methods and performs adequately at low SNRs. In [20], P. Shachi et al. used a CNN in a centralized cooperative spectrum-sensing framework that considers spatio-temporal data, channel shadowing, and spatial correlation; the CNN model is trained for decision making at the data fusion center. The results show that the CNN-based framework maintains strong sensing accuracy under noisy conditions. Although the performance and complexity of the CNN can be affected by noise, the results remain accurate because decisions draw on historical training and local user measurements. Hang Liu [21] focused on the stacking approach of fusion centers in CSS and on sensing OFDM signals. Recurrent correlated features are fed to a CNN, and a training database is built through a bagging strategy. The innovation is that a data fusion center using the stacked generalization approach can better learn the probabilistic prediction of PU states. The experimental results, reported in terms of both detection probability and false alarm probability, show an advantage over the traditional CSS method.

The application of ResNet to cooperative spectrum sensing is also in progress. Myke D. M. Valadão et al. considered experiments with neural network models placed in a CSS environment in which the user moves dynamically within a given range. In Ref. [22], they proposed a cooperative spectrum-sensing approach based on ResNet. A 3D matrix is used as the input to ResNet, conv3D is used for feature extraction, and two additional conventional CNN and RNN models are used for experimental comparison. The experimental results show that the nn-ResNet proposed in this paper detects SUs more accurately than the other two deep neural networks at noise power densities ranging from −140 to −120; as the number of SUs increases, its accuracy also remains better than that of the CNN and RNN, reaching a detection accuracy of 97.1% for 20 SUs. At ICECAA 2020, D Raghunatha Rao et al. [23] combined a deep residual network (DRN) with data-cleansing algorithms to design DRN test statistic refinement for crowd-sensor CSS. That is, the fused data in the CSS system is fed into the ResNet as a matrix together with the sensing information, and the data-cleansing algorithm senses the spectral availability of crowd sensors along the data fusion center to complete the statistical refinement of the DRN test. The simulation results show that the proposed model achieves better detection probability than data-cleaning algorithms and hypothesis-thresholding methods.

Graph neural networks (GNNs) are deep learning models capable of learning and reasoning about graph-structured data [24]. Because they represent nodes and edges as vectors or matrices, they can help sensing devices model node characteristics and support information dissemination and cooperative decision making among nodes in cooperative spectrum sensing. These advantages provide a new approach and tool for performance improvement in cooperative spectrum sensing, and scholars are beginning to apply GNNs to it. In 2023, to adapt to the dynamically changing radio environment and to model hidden node scenarios, Dimpal Janu et al. proposed the GCN-CSS model for dynamic real-world scenarios with multiple antennas [25]. The GCN-CSS method has a lower computational complexity than the CNN method and still improves sensing performance even with imperfect reporting channels. The proposed scheme provides the best performance improvement over different algorithms when the number of hidden nodes increases, and it outperforms algorithms such as CNN, ANN, SVM, and K-means clustering in terms of sensing performance. The results show that GCN-CSS has significant advantages in complex wireless environments that require dynamic adaptation and strong performance.
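As an illustration of the covariance-based input described above for CNN-based CSS, the sketch below computes a sample covariance matrix from the signals received by several SUs and arranges it as a two-channel array that a CNN could consume. It is a sketch under stated assumptions rather than the preprocessing of [19]: the function names, the real/imaginary channel layout, and the synthetic noise-only data are all assumptions made for illustration.

```python
import numpy as np


def sample_covariance(signals):
    """Sample covariance R = X X^H / N for signals of shape (num_sus, num_samples)."""
    x = np.asarray(signals, dtype=complex)
    num_samples = x.shape[1]
    return x @ x.conj().T / num_samples


def to_cnn_input(cov):
    """Stack real and imaginary parts as two channels: shape (2, M, M).

    The channel layout is an assumption; other layouts are equally plausible.
    """
    return np.stack([cov.real, cov.imag], axis=0)


# Synthetic noise-only example (hypothesis "PU absent"): 4 SUs, 256 samples each.
rng = np.random.default_rng(0)
num_sus, num_samples = 4, 256
noise = (rng.standard_normal((num_sus, num_samples))
         + 1j * rng.standard_normal((num_sus, num_samples))) / np.sqrt(2)

cov = sample_covariance(noise)
features = to_cnn_input(cov)
print(features.shape)
```

Under noise only, the off-diagonal entries of the matrix stay close to zero, whereas a common PU signal received by all SUs introduces correlation across rows; this structure is what a covariance-based detector can learn from.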

2.2. Applications of Deep Reinforcement Learning to Cooperative Spectrum Sensing

Deep reinforcement learning (DRL) is an artificial intelligence method that combines the advantages of deep learning and reinforcement learning [26]. Its decision making and learning in a dynamic environment fit the needs of CSS. As a result, scholars are beginning to introduce DRL into CSS to address distributed decision making, synchronization problems, and security and robustness issues. In Ref. [27], Shuai Liu et al. considered cooperative spectrum sensing in multi-user scenarios and dynamic environments. They proposed deep multi-user reinforcement learning (DMRL) with a deep Q-network [28] as the underlying framework, which is capable of determining the optimal action for each state in a relatively large, dynamic, unknown environment. After the DDQN is trained for each user individually, each SU determines the conflict state via acknowledge (ACK) signaling. In simulation experiments, the proposed method demonstrated enhanced spectral efficiency. Compared to other dynamic spectrum access (DSA) techniques, it exhibits a faster convergence rate and superior reward performance. Ref. [29], on the other hand, studies the dynamic spectrum sensing and aggregation problem in a multi-channel application scenario. The dynamic spectrum environment is modeled as a joint Markov chain, and the observed channel state values are fed into a deep Q-network to approximate the action-value function. The authors formulate the problem as a partially observable Markov decision process (POMDP) and propose a DQN framework to solve it. Their simulation results show that DQN achieves near-optimal decision accuracy in most scenarios, even without a priori knowledge of the system dynamics. Moreover, DQN has the lowest computational complexity and is not affected by the scale of the problem. Unlike the two studies mentioned above, Syed Qaisar Jalil et al. proposed a conservative Q-learning implementation of local cooperative spectrum sensing [30]. This approach learns complex data distributions more efficiently in offline DRL. The fusion center receives local sensing information from the SUs to generate global decisions. They validated the proposed method through simulation experiments, demonstrating accuracy comparable to other CSS methods and improved performance with reduced computation time. Peixiang Cai et al., on the other hand, focus on the problem of selecting appropriate neighbor nodes in CSS under correlated fading. They introduced the coordination graph (CG) method in [31] to decompose the global reward into a sum of local terms, thus transforming the CSS problem into a max-plus problem that can be solved by message passing, and finally combined it with the DQN algorithm to accelerate convergence. Their experimental results show that both the false alarm probability and the missed detection probability of the proposed model decrease as the number of SUs increases for the same maximum number of neighbor nodes.

In a real-world cooperative spectrum-sensing scenario, the data fusion center may suffer from security attacks, such as malicious nodes that compromise the performance of the cooperative spectrum-sensing system or the spectrum allocation. The following types of security issues are commonly found in CSS: false data injection attacks, identity spoofing attacks, and data tampering attacks. In view of these issues, scholars have started to study related countermeasures. Addressing the danger of malicious SUs sending forged data to the data fusion center to jam the CSS system, Anal Paul et al. [32] used the agent's intelligence in DRL to effectively reject the fake data sent to the fusion center (FC). The proposed model is also based on a deep Q-network, which is made robust to fake data attacks by the experience replay (ER) algorithm. Simulation experiments show that the proposed method has a 26.58 percent higher probability of PU detection than the existing CSS method. However, there are various types of attacks on the system, and related research will continue in the future.

Multi-agent deep reinforcement learning (MADRL) is a method that combines deep reinforcement learning with multi-agent systems. Due to its excellent cooperative decision-making and distributed learning capabilities, several scholars have applied it to CSS. Yu Zhang et al. [33] proposed realizing cooperative spectrum sensing using DQN-based deep reinforcement learning with multiple agents. The reward performance is obtained with UCB-H, a faster-searching variant of the upper confidence bound (UCB) algorithm. Their experiments compare the method against traditional algorithms based on Q-learning and ɛ-greedy, and the numerical results show that the model converges faster than those algorithms under the same conditions.

Researchers can gain a better understanding of this topic by studying and synthesizing recent research in this direction. We summarize the above work in Table 1. It is common for researchers to employ deep Q-networks or a combination with Q-learning for cooperative spectrum sensing, where a system can make decisions more accurately in a dynamic environment by combining corresponding algorithms. The scope of spectrum sensing has been slightly reduced due to various malicious activities or unidentified obstructions. Despite the need to account for model complexity, training data, and computational resources in practical scenarios, deep learning offers a promising avenue for cooperative spectrum sensing that can enhance system performance in multiple ways. More research is needed in this area to provide a boost for practical applications.
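The Q-learning-based schemes discussed in this subsection differ mainly in how the action-value function is represented and how actions are explored. The sketch below shows a tabular Q-learning update together with ɛ-greedy and UCB-style action selection. It is a simplified illustration, not the method of any cited paper: a DQN replaces the table with a neural network, and the bonus used here is a plain UCB1-style term rather than the UCB-H algorithm; all function names and parameter values are hypothetical.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def ucb_action(q_values, counts, step, c=1.0):
    """Pick the action maximizing value plus an exploration bonus (UCB1-style).

    Untried actions are selected first so that every count becomes positive.
    """
    for action, count in enumerate(counts):
        if count == 0:
            return action
    return max(
        range(len(q_values)),
        key=lambda a: q_values[a] + c * math.sqrt(math.log(step) / counts[a]),
    )


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Toy example: two states (e.g. channel idle or busy) and two actions (e.g. sense channel 0 or 1).
q_table = [[0.0, 0.0], [0.0, 1.0]]
updated = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
greedy = epsilon_greedy(q_table[1], epsilon=0.0, rng=random.Random(0))
explored = ucb_action([0.5, 0.2], counts=[10, 1], step=11)
print(updated, greedy, explored)
```

In the last call, the rarely tried action is chosen despite its lower estimated value, because its exploration bonus is larger; this count-aware exploration is the behavior that distinguishes UCB-style selection from ɛ-greedy.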

Table 1. A review of cooperative spectrum sensing based on deep neural networks and deep reinforcement learning.

Author	Year	Module	Performance	Application Scenario
W. Lee et al. [18]	2019	CNN-DCS	DCS (HD): Pd = 95% (Pf = 0.5); DCS (SD): Pd = 95.2% (Pf = 0.5)	Under harsh sensing conditions, CSS with correlated individual spectrum sensing.
Chen Z et al. [19]	2020	CSS-CNN	Pd = 90% (−16 dB, 20 SUs, Pf = 0.01)	Distributed secondary users accept perceptual samples in severe channel fading and shadowing environments.
P. Shachi et al. [20]	2020	CNN	Test accuracy: 98.34% (scenario 3); 100% (scenario 2)	Spectrum sensing performance analysis in dynamic scenarios with different noise floors and spaces.
Hang Liu et al. [21]	2019	EL+semi-soft FC	Pd = 96% (−18 dB, 60 SUs)	CSS for cognitive radio systems under OFDM-based signal.
Myke D. M. Valadão et al. [22]	2022	ResNet	Accuracy = 92% (NSD: −114 dBm, −174 dBm, 5 SUs)	Dynamic displacement of the user within a given area over a time period.
Raghunatha Rao D et al. [23]	2022	Deep ResNet data-cleansing algorithm	Pd = 95.78% (the matrix size of 10 × 5, Rician channel)	Sensing spectrum availability with crowd sensors via DRN.
Dimpal Janu et al. [25]	2023	GCN-CSS	Pd = 100% (−8 dB, Pf = 0.1)	Dynamics of the wireless environment in CR networks.
Shuai Liu et al. [27]	2021	DDQN	Average cumulative collision rate = 0.06. Average cumulative reward = 0.91	Dynamic cooperative spectrum-sensing environments where multiple PUs or multiple SUs can encounter conflicts.
Yunzeng Li et al. [29]	2020	DQN	Modified decision accuracy = 100% (index of system scenarios = 2)	Dynamic spectrum sensing in wireless networks containing N correlated channels.
Jalil S Q et al. [30]	2021	CQL	Detection accuracy = 70% (−14 dB)	Improved detection accuracy of SUs against PUs and reduced energy consumption.
Cai P et al. [31]	2020	DQN+ coordination graph	Average reward of the SUs in the CG = 0.29 (number of time slots = 1500)	Certain obstacles in the physical environment allow CR to occur under the associated decay and shadows.
Anal Paul et al. [32]	2022	DQL	Pd = 90% (−10 dB)	FC in CSS is subject to data forgery attacks.
Yu Zhang et al. [33]	2019	DQN+UCB-H	Average reward of all agents = 77% (number of time slots = 1500)	Each SU gathers information from the environment and other SUs to determine its sensing strategy, which can be structured in two different time slots.