1. Introduction
With the continued growth in demand from mobile users worldwide for information exchange, wireless communication is developing rapidly: large-scale commercial deployment of fifth-generation (5G) mobile communication is in full swing, and research and development of sixth-generation (6G) mobile communication technology has begun. 5G not only offers human-centered communication with a significantly better user experience, but has also given rise to many new internet industries, such as cloud gaming and virtual reality [1], with services ranging from telephony and short message service (SMS) to images, video, audio, and Internet of Things (IoT) data.
Nevertheless, as technologies and services have developed rapidly and become widely adopted, the number of mobile phone users has also increased dramatically. The available electromagnetic spectrum, a crucial and limited resource for wireless communication, is becoming scarce, and the expansion of wireless communication services means a growing demand for electromagnetic spectrum bandwidth
[2]. The spectrum resources planned by the International Telecommunication Union (ITU) range from 9 kHz to 275 GHz, and the ITU's 2006 forecast report on spectrum resources expected global spectrum demand to reach 1280 to 1720 MHz by 2020. However, with the development of mobile communication services and the increase in user data traffic, the originally predicted spectrum can no longer meet future demand, and it now meets only about half of the demand
[3]. To summarize, the shortage in electromagnetic spectrum resources, as a key problem of wireless mobile communication nowadays, will restrict the healthy development of wireless communication and become a bottleneck for developing wireless communication technology in the future if not solved effectively.
At present, spectrum resources are managed and allocated by governments worldwide. In the United States, for example, spectrum management and allocation fall under the Federal Communications Commission (FCC), and in China the National Radio Administration manages the spectrum. The allocation method is generally “static allocation”: the available spectrum is divided into multiple non-overlapping parts allocated to different users, the users of these frequency bands are called authorized users, and the bands are called authorized frequency bands. Although countries have allocated the spectrum rationally, decades of experience show that the utilization rate of each frequency band at different times and in different geographical locations is disappointing. A study by the FCC found that, among the licensed frequency bands available and in use, spectrum utilization is very low, especially in the band from hundreds of megahertz to 3 GHz, where demand for frequencies is very tight, with the utilization rate of the licensed frequency bands fluctuating between 15 and 85 percent. Only a few frequency bands have high utilization rates, the vast majority have low utilization rates, and the utilization rate in some regions is only 5%, so spectrum utilization is extremely unbalanced. At present, very little of the global spectrum below 1 GHz remains unallocated, and the remaining frequency bands available for wireless communication are also quite limited and cannot accommodate the expansion of new services. How to realize dynamic spectrum management and spectrum sharing mechanisms that improve the efficiency of spectrum utilization in each period and each region has become an urgent problem to be solved
[4].
Cognitive radio (CR) is a technology that can improve spectrum utilization, and efficient and accurate spectrum detection is the key to its implementation. The concept of cognitive radio was proposed by Dr. Joseph Mitola, the “father of software radio”, in his doctoral dissertation in 1999
[5]. He pointed out that the key to cognitive radio lies in the system’s ability to comprehensively recognize, analyze, learn, and judge all kinds of information in the external radio environment. The system can communicate intelligently with other cognitive devices to realize a dynamic spectrum allocation policy to improve the efficiency of the frequency spectrum and achieve reliable communication. The cognitive radio system consists of four main steps, i.e., spectrum sensing, spectrum analysis, spectrum judgment, and spectrum reconstruction. The working mode of the cognitive radio system is shown in
Figure 1 below.
Figure 1. The working mode of cognitive radio system.
Among the four steps, spectrum sensing (SS) is one of the essential components of cognitive radio, which is the core technology and prerequisite for realizing CR applications and constructing cognitive radio networks
[6]. Spectrum-sensing technology enhances spectrum use efficiency by continuously monitoring and analyzing the radio spectrum. It promptly identifies and utilizes unoccupied frequency bands when pre-allocated or statically assigned frequencies are unavailable, meeting the demand for spectrum resources. Radio spectrum sensing has a wide range of promising applications and has been the subject of significant research efforts. In the future, it is foreseeable that this technology will be widely used in communications, radar, unmanned aerial vehicles, intelligent transportation, and other fields. In the field of communication, spectrum sensing can be used for spectrum management and resource allocation in communication systems. SS can monitor the utilization of the radio spectrum to detect which primary user bands are free, for better spectrum resource allocation and dynamic spectrum sharing. In the field of radar, spectrum-sensing techniques can monitor and analyze the use of the radar spectrum to achieve optimal utilization and management of the spectrum for radar defense against jamming
[7]. In the field of unmanned aerial vehicles (UAVs), spectrum sensing can provide spectrum information for UAV communication and navigation. By sensing the surrounding spectral environment and information in real time, UAVs can select the optimal communication bands and avoid interfering sources. In addition, spectrum sensing can also help UAVs perform environment sensing and obstacle avoidance, improving safety and intelligence
[8]. In the field of smart transportation, this technology can monitor the spectrum used by various wireless devices and communication systems in transportation networks and perform dynamic spectrum management to ensure communication quality and reliability for various applications
[9]. As can be seen, the development of spectrum-sensing technology is of great strategic importance.
After decades of development, cognitive radio technology and its spectrum-sensing methods can be broadly categorized into two types: single-node spectrum sensing and cooperative spectrum sensing
[10]. Single-node spectrum sensing involves independent judgment by a single user and requires neither a complex system structure nor data fusion. It provides a relatively simple sensing process, but its performance is bounded by the physical constraints of a single node. Cooperative spectrum sensing, on the other hand, treats multiple secondary users (SUs) as a group of cooperating entities that share sensing information. This approach aims to detect the frequency usage of primary users (PUs) more accurately. However, it still encounters several practical challenges, including coordination, energy consumption, latency, security, and participation issues.
In recent years, artificial intelligence has emerged as a popular technology worldwide, and the development of deep learning has introduced new ideas and methods for spectrum-sensing research. Deep learning excels in nonlinear modeling and adaptive learning, significantly boosting spectrum sensing’s detection performance and speed. This, in turn, facilitates more efficient utilization of spectrum resources
[11]. Over the past few years, researchers in the field of communications have employed several prominent network models for spectrum sensing, including convolutional neural networks and long short-term memory networks
[12][13]. Several published papers have demonstrated the superiority of deep-learning-based spectrum-sensing algorithms compared to traditional approaches. Consequently, numerous researchers have investigated the combination of deep learning neural networks with spectrum-sensing research. For instance, in
[14], Kaixuan Du et al. used the geodesic distances between signals as statistical features and employed deep neural networks (DNNs) to classify the data based on these distances, thereby achieving spectrum sensing. Additionally, Jianxin Gai et al. proposed a spectrum-sensing method based on residual networks (ResNets) in
[15]. This method employed two-branch convolution to improve feature extraction, which led to higher detection probabilities and lower bit error rates (BERs) than traditional spectrum-sensing methods, especially at low signal-to-noise ratios (SNRs).
Currently, several institutions, including the University of Electronic Science and Technology, Harbin Institute of Technology, Beijing University of Posts and Telecommunications, Florida State University, and the University of Surrey, have been actively researching deep learning in the field of spectrum sensing for cognitive radio, and have made significant progress. This research aims to provide a summary of the current research status, application scenarios, and future development directions of deep-learning-based spectrum-sensing technology.
However, it is crucial to note that although the spectrum-sensing approach based on classical deep learning (DL) networks has shown promising performance, it also has certain limitations. Most of the current research focuses on the combination of deep learning and non-cooperative spectrum sensing. There are drawbacks to this approach, however. First, in non-cooperative spectrum sensing, the availability of large-scale labeled data can be limited as each device typically only has its own sensing results, leading to a lack of data sharing and collaboration. Second, in non-cooperative spectrum sensing, individual devices sense and process independently, which may not meet the computational requirements of deep learning models due to the computational power and energy constraints of the devices.
There has been a growing interest among scholars to explore the application of deep learning to cooperative sensing. Cooperative sensing allows the aggregation of data from multiple devices, resulting in a larger and more diverse dataset that can be used to train deep learning models. This approach offers several advantages. First, cooperative sensing provides real-time spectrum information and environmental change data, enabling deep learning models to adapt and update rapidly to dynamic conditions. Second, it enhances the availability of observational data and multi-source information, which improves the ability of deep learning models to accurately sense and identify interfering sources.
Researchers have started to investigate deep-learning-based cooperative sensing algorithms. For instance, Tan et al.
[16] constructed a new 2D dataset of received signals and trained three cooperative spectrum sensing (CSS) schemes using classical convolutional neural networks (CNNs) such as LeNet, AlexNet, and VGG-16. They compared the performance of CSS schemes based on AND, OR, and majority judgment, with the former achieving favorable results. On the other hand, Chen et al.
[17] proposed a DNN-based CSS algorithm and introduced the federated learning framework (FLF) into CSS. Their results demonstrated a detection probability of 98.78% and a false detection probability of 1% at an SNR of −15 dB.
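The AND, OR, and majority rules compared in [16] are hard-decision fusion rules: each SU reports a binary decision, and the fusion center declares the PU present when enough SUs agree. The Python sketch below is illustrative only and is not taken from [16]; the function names `fuse_decisions` and `detection_and_false_alarm` and the array layout are assumptions made for this example. It also shows how detection and false alarm probabilities of the kind quoted above could be estimated from labeled trials.

```python
import numpy as np


def fuse_decisions(local_decisions, rule="majority"):
    """Fuse binary local decisions (1 = PU present) reported by several SUs.

    local_decisions: array-like of shape (num_trials, num_sus) with 0/1 entries.
    Returns one global 0/1 decision per trial.
    """
    d = np.asarray(local_decisions, dtype=int)
    num_sus = d.shape[1]
    votes = d.sum(axis=1)  # number of SUs voting "PU present" in each trial
    if rule == "and":
        k = num_sus            # every SU must agree
    elif rule == "or":
        k = 1                  # a single vote is enough
    elif rule == "majority":
        k = num_sus // 2 + 1   # strictly more than half
    else:
        raise ValueError(f"unknown rule: {rule}")
    return (votes >= k).astype(int)


def detection_and_false_alarm(global_decisions, truth):
    """Estimate (Pd, Pf) from global decisions and ground-truth PU states."""
    g = np.asarray(global_decisions, dtype=bool)
    t = np.asarray(truth, dtype=bool)
    pd = g[t].mean()    # fraction of occupied trials declared occupied
    pf = g[~t].mean()   # fraction of idle trials declared occupied
    return float(pd), float(pf)
```

All three rules are special cases of a K-out-of-N vote, which is why a single threshold `k` is enough: the OR rule favors detection at the cost of more false alarms, while the AND rule does the opposite.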
2. Deep-Learning-Based Cooperative Spectrum Sensing
The approaches introduced so far were designed for non-cooperative spectrum sensing. In real-world application scenarios, however, multiple PUs and multiple SUs generally communicate at the same time, which adds considerable complexity and potential for conflict.
Several modes and applications of cooperative spectrum sensing were introduced earlier. The hidden node problem has emerged as a prominent issue in cooperative spectrum sensing, and in real-world communication environments shadowing and multipath effects may further degrade the accuracy of sensing results. How to allocate resources efficiently and rationally to each sensing node is also an urgent issue to be addressed.
Recently, scholars have started to introduce deep learning and deep reinforcement learning into the field of cooperative spectrum sensing to alleviate or solve the aforementioned problems. For example, deep learning can optimize data fusion by automatically extracting features from the data of multiple sensing nodes, enabling the system to make more accurate judgments. Another example is that, with proper training, deep learning models can better handle and predict uncertainty in dynamic environments. Deep reinforcement learning, on the other hand, can recognize potential security threats or learn to make effective and robust decisions in CSS security scenarios through continual learning and adaptation.
2.1. Applications of Deep Neural Networks in Cooperative Spectrum Sensing
A review of the pertinent literature from recent years shows that CNNs, ResNets, and graph neural networks are the most frequently used deep neural networks in cooperative spectrum sensing. The remainder of this subsection summarizes recent research findings in this area.
In 2019, Woongsup Lee et al. considered the case where multiple SUs share sensing information to make state judgments on whether a single PU occupies more than one frequency band. They proposed a CNN-based CSS model in
[18] to improve the sensing performance and stability by considering the state and spectral correlation of the channel and extracting the data space features to be input into the optimized modest-sized CNN. The simulation results show that the method outperforms the traditional CSS methods such as K-out-of-N and SVM. The paper is also the first to apply deep neural networks to CSS.
In Ref.
[19], Zhibo Chen et al. applied a CNN to distributed cooperative spectrum sensing, using the sample covariance matrix of the sensing information collected from different SUs as the test statistic and as the input to the CNN. The proposed model, CSS-CNN, was evaluated under channel fading and compared with other CNN and CLDNN methods, and it performs adequately at low SNRs. In contrast, in
[20], P. Shachi et al. used a CNN in a centralized cooperative spectrum-sensing framework and considered spatio-temporal data, channel shadowing, and spatial correlation, and the CNN model was also trained for decision making through data fusion centers. The results show that the CNN-based framework maintains strong sensing accuracy under noisy conditions. Although the performance and complexity of CNN can be affected by noise, the results are still accurate due to the decision inputs from historical training and local user measurements.
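To make the covariance-based input used in [19] concrete, the following sketch computes a sample covariance matrix from the samples of several SUs and arranges it as a two-channel array that a CNN could consume. It is a minimal illustration under stated assumptions: the function names, the zero-mean signal model, and the real/imaginary channel split are choices made for this example, not details confirmed by [19].

```python
import numpy as np


def sample_covariance(samples):
    """Sample covariance R = X X^H / N of zero-mean received samples.

    samples: complex array-like of shape (num_sus, num_samples), one row per SU.
    """
    x = np.asarray(samples, dtype=complex)
    return x @ x.conj().T / x.shape[1]


def to_cnn_input(cov):
    """Stack real and imaginary parts as two channels: shape (2, M, M).

    The two-channel layout is an assumption for illustration only.
    """
    return np.stack([cov.real, cov.imag]).astype(np.float32)


# Example with synthetic noise-only samples from 4 SUs (hypothetical sizes).
rng = np.random.default_rng(0)
noise = rng.standard_normal((4, 256)) + 1j * rng.standard_normal((4, 256))
features = to_cnn_input(sample_covariance(noise))  # shape (2, 4, 4)
```

When only noise is present the off-diagonal entries tend toward zero, whereas a common PU signal introduces correlation between SUs; that difference is the structure a classifier can learn from this input.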
Hang Liu
[21] focused on the stacking approach at the fusion center in CSS for sensing OFDM signals. Recurrent correlated feature learning is fed to a CNN, and a training database is built through a bagging strategy. The innovation is that a data fusion center using the stacked generalization approach can better learn the probabilistic prediction of PU states. The experimental results show an advantage over the traditional CSS method in terms of both the detection probability and the false alarm probability.
ResNet is also being applied to cooperative spectrum sensing. Myke D. M. Valadão et al. studied neural network models in a CSS environment in which users move dynamically within a given area. In Ref.
[22], they proposed a cooperative spectrum-sensing approach based on ResNet. The 3D matrix is used as the input to ResNet, conv3D is used for feature extraction, and two additional conventional CNN and RNN models are used for experimental comparison. The experimental results show that the nn-ResNet has a higher accuracy of SU detection than the other two deep neural networks at noise power densities ranging from −140 to −120, and with an increase in the number of SUs, the accuracy of this model is also better than the CNN and RNN, with a detection accuracy of 97.1% for an SU number of 20.
In ICECAA 2020, D Raghunatha Rao et al.
[23] combined deep residual network and data-cleansing algorithms to design DRN test statistic refinement for a crowd sensor CSS. That is, the fused data in the CSS system is fed into the ResNet as a matrix together with the sensing information, and the data-cleansing algorithm senses the spectral availability of crowd sensors along the data fusion center to complete the statistical refinement of the DRN test. The simulation results show that the proposed model achieves better detection probability than data-cleaning algorithms and hypothesis-thresholding methods.
Graph neural networks (GNNs), deep learning models capable of learning and reasoning about graph-structured data
[24], can help sensing devices model node characteristics and support information dissemination and cooperative decision making among nodes in cooperative spectrum sensing due to their ability to represent nodes and edges as vectors or matrices. These advantages provide a new approach and tool for performance improvement in cooperative spectrum sensing. Scholars are beginning to apply GNNs to cooperative spectrum sensing.
In 2023, to adapt to the dynamically changing radio environment and to model hidden node scenarios, Dimpal Janu et al. proposed GCN-CSS, a graph convolutional network (GCN) model for dynamic real-world scenarios with multiple antennas [25]. The GCN-CSS method has a lower computational complexity than the CNN method and still improves sensing performance even with imperfect reporting channels. The proposed scheme provides the largest performance improvement over the other algorithms as the number of hidden nodes increases, and it outperforms algorithms such as CNN, ANN, SVM, and K-means clustering in terms of sensing performance. The results show that GCN-CSS has significant advantages in complex wireless environments that require dynamic adaptation and strong performance.
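As a rough illustration of how a graph network lets sensing nodes exchange information, the sketch below implements one generic graph-convolution layer in NumPy, in which each SU mixes its own features with those of its neighbors. This is the widely used normalized-adjacency propagation rule and is not claimed to be the architecture of [25]; the function name `gcn_layer` and the toy graph are assumptions.

```python
import numpy as np


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adjacency: (N, N) symmetric 0/1 matrix, 1 where two SUs are neighbors.
    features:  (N, F) per-SU input features H (e.g., local sensing statistics).
    weights:   (F, F_out) trainable weight matrix W.
    """
    a = np.asarray(adjacency, dtype=float)
    a_hat = a + np.eye(a.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    # Symmetric normalization so high-degree nodes do not dominate.
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    h = a_norm @ np.asarray(features, dtype=float) @ np.asarray(weights, dtype=float)
    return np.maximum(h, 0.0)                 # ReLU


# Toy example: three SUs on a line graph (0 - 1 - 2), hypothetical features.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = np.ones((3, 2))
W = np.ones((2, 4))
out = gcn_layer(A, H, W)  # shape (3, 4)
```

Stacking such layers lets information from an SU that can observe the PU reach a hidden node several hops away, which is one intuition for why graph models suit the hidden node scenario.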
2.2. Applications of Deep Reinforcement Learning to Cooperative Spectrum Sensing
Deep reinforcement learning (DRL) is an artificial intelligence method that combines the advantages of deep learning and reinforcement learning
[26]. Its decision making and learning in a dynamic environment fit the needs of CSS. As a result, scholars are now beginning to introduce DRL into CSS to address distributed decision making, synchronization problems, and security and robustness issues in CSS.
In Ref.
[27], Shuai Liu et al. considered cooperative spectrum sensing in multi-user scenarios and dynamic environments. Deep multi-user reinforcement learning (DMRL) with deep Q-network
[28] as the underlying framework is proposed, which is capable of determining the optimal action for each state in a relatively large, dynamic, unknown environment. After training the DDQN by training all users individually, each SU performs conflict state determination via acknowledge (ACK) signaling. In simulation experiments, the proposed method demonstrated enhanced spectral efficiency. Compared to other dynamic spectrum access (DSA) techniques, this method exhibits a faster convergence rate and superior reward performance.
Ref.
[29], on the other hand, studies the dynamic spectrum sensing and aggregation problem in a multi-channel application scenario. The dynamic spectrum environment is modeled as a joint Markov chain, the problem is formulated as a partially observable Markov decision process (POMDP), and the observed channel state values are fed into a deep Q-network (DQN) that approximates the action-value function. The simulation results show that the DQN achieves near-optimal decision accuracy in most scenarios, even without a priori knowledge of the system dynamics. Moreover, the DQN has the lowest computational complexity and is unaffected by the scale of the problem. Unlike the two studies mentioned above, Syed Qaisar Jalil et al. proposed a conservative Q-learning implementation of local cooperative spectrum sensing
[30]. This approach is able to learn complex data distributions more efficiently in offline DRL. The fusion center receives local sensing information from the SU to generate global decisions. They validated the proposed method through simulation experiments, demonstrating its comparable accuracy to other CSS methods and its performance improvement while reducing computation time.
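The action-value function that a deep Q-network approximates can be illustrated with its tabular counterpart. The sketch below shows a single Q-learning update and greedy action selection; a DQN replaces the table with a neural network but uses the same temporal-difference target. The state and action encoding, the function names, and the hyperparameter values are hypothetical and are not taken from [27], [29], or [30].

```python
import numpy as np


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One temporal-difference step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    target = reward + gamma * np.max(q_table[next_state])
    q_table[state, action] += alpha * (target - q_table[state, action])
    return q_table


def greedy_action(q_table, state):
    """Pick the action (e.g., which channel to sense) with the highest value."""
    return int(np.argmax(q_table[state]))


# Toy setting: 2 observed states, 3 candidate channels (assumed sizes).
q = np.zeros((2, 3))
q_update(q, state=0, action=1, reward=1.0, next_state=1)  # sensed an idle channel
q_update(q, state=1, action=2, reward=0.0, next_state=0)
best = greedy_action(q, state=0)
```

In this toy run the first update raises the value of channel 1 in state 0, and the second update propagates part of that value back to state 1 through the discounted maximum, which is the mechanism that lets the agent account for future channel states.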
Peixiang Cai et al., on the other hand, focused on the problem of selecting appropriate neighbor nodes in CSS under correlated fading. In [31], they introduced the coordination graph (CG) method to decompose the global reward into a sum of local terms, thus transforming the CSS problem into a max-plus problem that can be solved by message passing, and combined it with the DQN algorithm to accelerate convergence. Their experimental results show that both the false alarm probability and the missed detection probability of the proposed model decrease as the number of SUs increases for the same maximum number of neighbor nodes.
In a real-world cooperative spectrum-sensing scenario, the data fusion center may suffer from security attacks, such as the presence of malicious nodes that can compromise the performance of the cooperative spectrum-sensing system or the spectrum allocation. The following types of security issues are commonly found in CSS: false data injection attacks, identity spoofing attacks, and data tampering attacks. In view of these security issues in CSS, scholars have started to study related countermeasures.
To address the danger of malicious SUs sending forged data to the data fusion center (FC) to jam CSS systems, Anal Paul et al. [32] sought to use a DRL agent to effectively avoid the fake data sent to the FC. The proposed model is also based on a deep Q-network and is made robust to fake data attacks by the experience replay (ER) algorithm. Simulation experiments show that the proposed method achieves a 26.58 percent higher PU detection probability than the existing CSS method. However, such systems face various other types of attack, and related research will continue in the future.
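Experience replay, mentioned above, stores past transitions and trains the Q-network on random mini-batches drawn from that store, which breaks the correlation between consecutive samples. The buffer below is a generic sketch of the idea and not the implementation in [32]; the class name `ReplayBuffer` and the transition layout are assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest entries are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw a uniform random mini-batch without replacement."""
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: keep only the 3 most recent transitions (hypothetical capacity).
buf = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buf.push(state=step, action=0, reward=1.0, next_state=step + 1)
batch = buf.sample(2)
```

Because each mini-batch mixes transitions from many time steps, a short burst of falsified reports contributes only a fraction of any single update.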
Multi-agent deep reinforcement learning (MADRL) is a method that combines deep reinforcement learning with multi-agent systems. Due to its excellent cooperative decision-making and distributed learning capabilities, several scholars have applied it to CSS.
Yu Zhang et al. [33] proposed realizing cooperative spectrum sensing using DQN-based deep reinforcement learning with multiple agents. Rewards are obtained with UCB-H, a faster-searching variant of the upper confidence bound (UCB) algorithm. Their experiments compare the model with traditional algorithms based on Q-learning and ɛ-greedy, and the numerical results show that the model converges faster than these baselines under the same conditions.
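The difference between UCB-style exploration and ɛ-greedy can be seen in a small sketch. The code below implements the basic UCB1 selection rule, not the UCB-H variant used in [33], whose details are not given here; the function name `ucb1_select` and the exploration constant `c` are assumptions.

```python
import math


def ucb1_select(counts, mean_rewards, c=2.0):
    """Choose the arm maximizing mean_reward + sqrt(c * ln(t) / n).

    counts:       number of times each action (e.g., channel) has been tried.
    mean_rewards: empirical mean reward of each action.
    Untried actions are selected first so every bonus term is defined.
    """
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        m + math.sqrt(c * math.log(total) / n)
        for m, n in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

Unlike ɛ-greedy, which explores uniformly at random with a fixed probability, the bonus term directs exploration toward actions that have been tried least and shrinks as evidence accumulates.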
The work reviewed above is summarized in Table 1. Researchers commonly employ deep Q-networks, alone or in combination with Q-learning, for cooperative spectrum sensing, and combining them with complementary algorithms allows a system to make more accurate decisions in a dynamic environment. The effective scope of spectrum sensing is nevertheless somewhat reduced by malicious activities and unidentified obstructions. Although model complexity, the need for training data, and the demand for computational resources must be accounted for in practical scenarios, deep learning offers a promising avenue for cooperative spectrum sensing that can enhance system performance in multiple ways. More research is needed in this area to support practical applications.
Table 1. A review of cooperative spectrum sensing based on deep neural networks and deep reinforcement learning.
Author | Year | Model | Performance | Application Scenario |
W. Lee et al. [18] | 2019 | CNN-DCS | DCS (HD): Pd = 95% (Pf = 0.5); DCS (SD): Pd = 95.2% (Pf = 0.5) | Under harsh sensing conditions, CSS with correlated individual spectrum sensing. |
Chen Z et al. [19] | 2020 | CSS-CNN | Pd = 90% (−16 dB, 20 SUs, Pf = 0.01) | Distributed secondary users receive sensing samples in severe channel fading and shadowing environments. |
P. Shachi et al. [20] | 2020 | CNN | Test accuracy: 98.34% (scenario 3); 100% (scenario 2) | Spectrum sensing performance analysis in dynamic scenarios with different noise floors and spaces. |
Hang Liu et al. [21] | 2019 | EL+semi-soft FC | Pd = 96% (−18 dB, 60 SUs) | CSS for cognitive radio systems under OFDM-based signal. |
Myke D. M. Valadão et al. [22] | 2022 | ResNet | Accuracy = 92% (NSD: −114 dBm −174 dBm, 5 SUs) | Dynamic displacement of the user within a given area over a time period. |
Raghunatha Rao D et al. [23] | 2022 | Deep ResNet data-cleansing algorithm | Pd = 95.78% (the matrix size of 10 × 5, Rician channel) | Sensing spectrum availability with crowd sensors via DRN. |
Dimpal Janu et al. [25] | 2023 | GCN-CSS | Pd = 100% (−8 dB, Pf = 0.1) | Dynamics of the wireless environment in CR networks. |
Shuai Liu et al. [27] | 2021 | DDQN | Average cumulative collision rate = 0.06. Average cumulative reward = 0.91 | Dynamic cooperative spectrum-sensing environments where multiple PUs or multiple SUs can encounter conflicts. |
Yunzeng Li et al. [29] | 2020 | DQN | Modified decision accuracy = 100% (index of system scenarios = 2) | Dynamic spectrum sensing in wireless networks containing N correlated channels. |
Jalil S Q et al. [30] | 2021 | CQL | Detection accuracy = 70% (−14 dB) | Improved detection accuracy of SUs against PUs and reduced energy consumption. |
Cai P et al. [31] | 2020 | DQN + coordination graph | Average reward of the SUs in the CG = 0.29 (number of time slots = 1500) | Obstacles in the physical environment cause CR to operate under correlated fading and shadowing. |
Anal Paul et al. [32] | 2022 | DQL | Pd = 90% (−10 dB) | FC in CSS is subject to data forgery attacks. |
Yu Zhang et al. [33] | 2019 | DQN+UCB-H | Average reward of all agents = 77% (number of time slots = 1500) | Each SU gathers information from the environment and other SUs to determine its sensing strategy, which can be structured in two different time slots. |