1. Introduction
The incessant advancement in communication technology has profoundly impacted various aspects of social life, and the demand for wireless communication continues to escalate. Typically, signals undergo appropriate modulation during transmission, and as the transmission environment grows increasingly complex, multiple modulation types are included within the communication frequency band [1]. Consequently, it is important to investigate modulation recognition techniques for communication signals in depth. In non-cooperative communication systems [2][3], modulation recognition primarily serves to process the received signals; analyze the modulation type; and subsequently perform signal demodulation, decoding and other operations to obtain valuable information. In cooperative communication systems, modulation recognition techniques are also applied in numerous fields, including spectrum sensing [4][5], spectrum resource management [6], cognitive radio [7] and others. In summary, to guarantee communication security, relevant departments must reinforce the supervision of communication signals. This requires the effective identification of interference information embedded within signals, and modulation recognition can play a crucial role in achieving the efficient allocation of spectrum resources.
Most current modulation recognition techniques are based on likelihood ratio theory or feature extraction algorithms, which involve intricate steps and exacting conditions. A primary drawback of these approaches is that feature extraction and selection may result in the loss of some signal information. Consequently, neural network-based modulation recognition algorithms have garnered attention, as they can achieve end-to-end recognition without manual feature extraction. This class of algorithms can retain the signal information to the maximum extent and achieve better results, and it is more suitable for emerging modulation types. However, the deep features extracted by neural networks cannot effectively distinguish all modulation types, resulting in confusion among certain modulation types. Existing methods attempt to resolve this issue by increasing the number of network layers, for example by adopting deep networks such as residual network 50 (ResNet50), to improve the modulation recognition rates. Nonetheless, when the dataset is large and the network has numerous parameters to learn, training takes a long time, thereby diminishing model efficiency. Furthermore, when the neural network is initialized with random weights and trained several times on the same dataset, the recognition performance varies considerably from one training run to another. The modulation recognition rates of the same network trained with signals at high and low signal-to-noise ratios (SNRs) also exhibit significant disparities.
2. An Improved Modulation Recognition Algorithm Based on Fine-Tuning and Feature Re-Extraction
In communication systems, a baseband signal needs to be modulated for transmission in the channel. With the development of communication technology, there are various modulation types with different characteristics. Modulation recognition is a two-step process: pre-processing the communication signals and using the appropriate classifier to recognize the modulation types [8]. The modulation recognition algorithms for communication signals can currently be divided into three categories [9]: likelihood-based, feature-based and deep learning-based algorithms.
The modulation recognition algorithm based on the likelihood function, which successfully distinguishes between binary phase-shift keying (BPSK) and quadrature phase-shift keying (QPSK) signals, was first proposed in [10]. More specifically, the authors calculated the probability density functions of signal parameters, such as the symbol transmission rate, the SNR and the carrier frequency; obtained the corresponding log-likelihood ratio; and then estimated the modulation order of the signals. However, the derivation of the likelihood function is computationally complex and requires a priori knowledge about the distribution of statistics [11]. Moreover, the specific decision criteria for the likelihood ratio differ from one practical problem to another, so likelihood-based modulation recognition algorithms generalize poorly. In addition, it is difficult to obtain accurate values of signal parameters at low SNRs, which affects the recognition of the modulation types.
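As a rough illustration of the likelihood-based idea, the sketch below compares the log-likelihoods of a received sequence under BPSK and QPSK hypotheses in complex AWGN and decides by the sign of the log-likelihood ratio. It is not the method of [10]: it assumes unit symbol energy, equiprobable symbols, a known noise variance and perfect carrier and timing synchronization, and the names `log_likelihood`, `classify` and `simulate` are hypothetical.

```python
import cmath
import math
import random

# Assumed unit-energy constellations (illustrative choice).
BPSK = [1 + 0j, -1 + 0j]
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]


def log_likelihood(samples, constellation, noise_var):
    """Log-likelihood of the samples for equiprobable symbols in complex AWGN."""
    total = 0.0
    for r in samples:
        # Log-sum-exp over the constellation points for numerical stability.
        exponents = [-abs(r - s) ** 2 / noise_var for s in constellation]
        m = max(exponents)
        total += m + math.log(sum(math.exp(e - m) for e in exponents))
        total -= math.log(len(constellation) * math.pi * noise_var)
    return total


def classify(samples, noise_var):
    """Return the decision and the log-likelihood ratio (BPSK versus QPSK)."""
    llr = (log_likelihood(samples, BPSK, noise_var)
           - log_likelihood(samples, QPSK, noise_var))
    return ("BPSK" if llr > 0 else "QPSK"), llr


def simulate(constellation, n, snr_db, rng):
    """Draw n noisy symbols; noise_var is the total complex noise power."""
    noise_var = 10 ** (-snr_db / 10)
    sigma = math.sqrt(noise_var / 2)  # standard deviation per real dimension
    samples = [rng.choice(constellation)
               + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
               for _ in range(n)]
    return samples, noise_var


rng = random.Random(0)
received, noise_var = simulate(QPSK, 500, 10, rng)
print(classify(received, noise_var))
```

The fact that `classify` must be handed `noise_var` mirrors the a priori knowledge requirement of likelihood-based methods: in practice this parameter would have to be estimated, which becomes unreliable at low SNRs.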
The modulation recognition algorithm based on signal feature extraction [12][13] consists of the following three steps. First, the modulated signals are pre-processed, mainly by down-sampling and digital filtering. Second, features are extracted from different perspectives to describe the signals effectively. Finally, based on the differences among the corresponding feature values, the modulated signals are recognized by setting appropriate thresholds. Zhang et al.
[14] constructed six characteristic parameters based on instantaneous information and signal spectrum. The proposed method correctly classified the modulated signals of two-level amplitude-shift keying (2ASK), four-level amplitude-shift keying (4ASK), two-level frequency-shift keying (2FSK), BPSK, minimum shift keying (MSK), frequency modulation (FM), lower sideband (LSB) and upper sideband (USB) with more than 95% recognition rate at SNR = 6 dB. On the basis of high-order cumulants, combined with peak features of the FFT spectrum and instantaneous signal features, Yang et al.
[15] proposed a new method for digital modulation recognition based on mixed signal features. The new method successfully and efficiently recognized six classical digital modulation types and achieved satisfactory recognition results even at rather low SNRs. By considering the different cumulant combinations of 2FSK, 4FSK, BPSK, QPSK, 2ASK and 4ASK signals, Xie et al.
[16] established new signal parameters to achieve better recognition of these digital modulation types. The overall recognition accuracy was 99% at SNR = −5 dB and 100% at SNR = −2 dB. Wang et al.
[17] used the fourth-order cumulants of four signals (8PSK, 16QAM, PAM4 and BPSK) as the recognition parameters. Under additive white Gaussian noise (AWGN) channels, the recognition accuracy reached more than 90% when the number of symbols was above 250 and SNR > 10 dB. Hassanpour et al.
[18] proposed a wavelet-based algorithm for the recognition of binary digital modulation types, including 2ASK, 2FSK and BPSK, in the presence of AWGN. Average recognition rates of 99.97%, 99.71% and 97.34% were obtained for the three modulation types at SNRs of −5 dB, −7 dB and −10 dB, respectively. Yang et al.
[19] converted the time-domain diagrams of different complex modulated signals into spectrogram images using the wavelet transform. Then, the authors adopted AlexNet to classify the eight modulated signals of 2ASK, 4ASK, 2PSK, 4PSK, 2FSK, 4FSK, 16QAM and 64QAM. The recognition accuracy of the eight modulation types was almost 100% at higher SNRs. In
[20], a new blind modulation classification (BMC) method was proposed for classifying the three modulated signals of QPSK, offset-QPSK (OQPSK) and π/4-QPSK, based on the second-order and fourth-order cyclic cumulants. The proposed feature-based BMC algorithm added robustness against various impairments and worked well even in the frequency-selective fading channels. Wei et al.
[21] proposed a novel method for the automatic modulation classification (AMC) of digital communication signals using a support vector machine (SVM) based on hybrid features, cyclostationarity and information entropy. Moreover, the authors proposed three new features, which did not require any prior information and had a strong anti-noise ability. Shi
[22] extracted Box fractal dimension, Katz fractal dimension, Higuchi fractal dimension, Petrosian fractal dimension and Sevcik fractal dimension from eight modulated signals. In addition, back-propagation (BP) neural network, gray relation analysis (GRA), random forest (RF) and K-nearest neighbor (KNN) were used to recognize the different modulated signals based on the fractal features. The results indicated that RF had better recognition performance with 96% accuracy at SNR = 10 dB. Wang et al.
[23] proposed a low-complexity graphic constellation projection (GCP) algorithm for AMC, and adopted the deep belief network (DBN) to learn the underlying features in these constellations. The recognition accuracy was beyond 95% at SNR = 0 dB. Yan et al.
[24] presented an innovative AMC method using graph-based constellation analysis for M-ary QAM signals. The proposed method with lower computational complexity could provide superior performance compared with existing subtractive clustering techniques and was robust to the residual phase and timing offsets. In summary, modulation recognition performance can be improved by extracting features with significant differences among the modulation types from multiple perspectives. Moreover, it is necessary to select an appropriate classifier in order to obtain better recognition performance. The feature-based modulation recognition algorithm is less computationally intensive and simpler to implement than the likelihood-based one, but the recognition performance depends on the number of features and the differences among features. Moreover, it is difficult to accurately extract features in non-ideal channels.
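The three-step feature-based procedure described above (pre-processing, feature extraction, thresholding) can be sketched with fourth-order cumulants. This is only an illustrative example and does not reproduce any of the cited methods: the choice of BPSK, QPSK and 16QAM, the known noise variance, the function names and the thresholds (midpoints between the theoretical normalized C42 values of −2, −1 and −0.68) are all assumptions.

```python
import math
import random


def fourth_order_features(x, noise_var=0.0):
    """Return (|C40|, C42), normalized by the squared signal power."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]                 # step 1: remove any DC offset
    c20 = sum(v * v for v in x) / n           # E[x^2]
    c21 = sum(abs(v) ** 2 for v in x) / n     # E[|x|^2]
    m40 = sum(v ** 4 for v in x) / n          # E[x^4]
    m42 = sum(abs(v) ** 4 for v in x) / n     # E[|x|^4]
    c40 = m40 - 3 * c20 ** 2                  # step 2: fourth-order cumulants
    c42 = m42 - abs(c20) ** 2 - 2 * c21 ** 2
    power = c21 - noise_var                   # assumed-known noise power removed
    return abs(c40) / power ** 2, c42 / power ** 2


def classify_by_threshold(c42_norm):
    """Step 3: hypothetical thresholds between -2 (BPSK), -1 (QPSK), -0.68 (16QAM)."""
    if c42_norm < -1.5:
        return "BPSK"
    if c42_norm < -0.84:
        return "QPSK"
    return "16QAM"


def random_symbols(kind, n, snr_db, rng):
    """Unit-power symbols of the given type plus complex AWGN."""
    if kind == "BPSK":
        points = [1 + 0j, -1 + 0j]
    elif kind == "QPSK":
        points = [complex(a, b) / math.sqrt(2) for a in (-1, 1) for b in (-1, 1)]
    else:  # 16QAM
        levels = (-3, -1, 1, 3)
        points = [complex(a, b) / math.sqrt(10) for a in levels for b in levels]
    noise_var = 10 ** (-snr_db / 10)
    sigma = math.sqrt(noise_var / 2)
    x = [rng.choice(points) + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
         for _ in range(n)]
    return x, noise_var


rng = random.Random(1)
for kind in ("BPSK", "QPSK", "16QAM"):
    x, nv = random_symbols(kind, 4000, 15, rng)
    _, c42 = fourth_order_features(x, nv)
    print(kind, "->", classify_by_threshold(c42))
```

Because the fourth-order cumulants of Gaussian noise are zero, only the normalizing power needs a noise correction here. With few symbols or an unknown noise level the estimates scatter around the thresholds, which is one reason such features become hard to extract accurately in non-ideal channels.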
In recent years, with the rapid development of deep learning, researchers have started applying it to signal processing [25][26][27][28][29][30]. The main innovation of deep learning-based methods is that novel network architectures with tens or even hundreds of layers, together with new network training methods, can be used for recognition. On the one hand, deep learning-based modulation recognition algorithms can take hand-crafted features extracted from the original signals as the inputs of neural networks. Lee et al.
[31] proposed an enhanced blind modulation classification (BMC) method based on a deep neural network (DNN) for fading channels. The authors used the DNN to recognize 16QAM, 64QAM, BPSK, QPSK and 8PSK based on 28 signal features. The experimental results showed that the recognition rate improved as the number of signal features increased. Kim et al.
[32] adopted a deep connected neural network (DCNN) with artificial features as the network inputs to successfully recognize PSK and QAM signals with different orders. The authors examined the effect of Gaussian white noise and Doppler frequency shift on the network recognition performance and confirmed that DCNN had stronger generalization ability and signal recognition ability. Mendis et al.
[33] proposed an automatic modulation classification (AMC) method based on a spectral correlation function (SCF) pattern. The authors used DBN to abstract the complex signal features that were represented by the associated SCF patterns and then distinguished among five kinds of digitally modulated signals using the features. The proposed method had low sensitivity to Gaussian white noise channels. In addition, the recognition accuracy was greatly reduced in the AWGN environment. To solve the problem, a multi-carrier recognition system based on CNN and principal component analysis (PCA) was proposed in
[34]. The PCA-based processing method could suppress AWGN and reduce the dimension of the network inputs. The system correctly identified three kinds of multi-carrier waveforms in a dense transmission environment and achieved good recognition results even at low SNRs. Gou et al.
[35] proposed a semi-supervised learning method based on data-driven models that combined contrastive predictive coding with an unsupervised pre-training algorithm, as well as a supervised learning algorithm. The authors constructed a joint DNN composed of long short-term memory (LSTM) and ResNet50 and then extracted the instantaneous features using the Hilbert transform as the network inputs to recognize 11 modulation types. The semi-supervised joint neural network structure improved the recognition accuracy by 3∼20% compared with the previous methods and reached an average recognition accuracy of 94% at SNR levels ranging from 0 dB to 18 dB.
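To make the notion of instantaneous features concrete, the following sketch builds the analytic signal of a real sequence with a Hilbert transform computed in the frequency domain and derives instantaneous amplitude, phase and frequency from it. It is a minimal, dependency-free example and not the pipeline of [35]: the naive O(N²) DFT, the function names and the test tone are assumptions made for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (adequate for short frames only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Zero the negative frequencies and double the positive ones."""
    n = len(x)
    spectrum = dft(x)
    gain = [0.0] * n
    gain[0] = 1.0                      # keep the DC bin
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist bin
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    return dft([g * s for g, s in zip(gain, spectrum)], inverse=True)


def instantaneous_features(x):
    """Return instantaneous amplitude, phase (rad) and frequency (cycles/sample)."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    phase = [cmath.phase(v) for v in z]
    # Phase increment between consecutive samples, wrapped to [-pi, pi].
    frequency = [cmath.phase(z[i + 1] * z[i].conjugate()) / (2 * math.pi)
                 for i in range(len(z) - 1)]
    return amplitude, phase, frequency


n = 64
tone = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
amplitude, phase, frequency = instantaneous_features(tone)
print(round(amplitude[0], 6), round(frequency[0], 6))
```

For a pure tone the amplitude and frequency tracks are flat, whereas amplitude-, phase- or frequency-keyed signals show characteristic jumps in the corresponding track, which is what makes these sequences useful as network inputs.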
On the other hand, deep learning-based recognition algorithms can directly utilize the original signals as the network inputs and realize end-to-end recognition. This class of algorithms has strong generalization ability and robustness for various modulation recognition tasks. O’Shea et al.
[36] developed a new end-to-end modulation recognition algorithm based on deep residual network (DRN). The proposed algorithm was feasible in realistic communication environments and achieved higher recognition accuracy at low SNRs than the other methods mentioned in the paper. Zhang et al.
[37] used DBN and temporal in-phase and quadrature (IQ) data representation to identify 11 modulation types. The method obtained high recognition accuracy at high SNRs. Vanhoy et al.
[38] proposed a branch convolutional neural network (B-CNN) to recognize more than 20 modulated signals. Xu et al.
[39] proposed an effective multi-stream network structure, namely, multi-channel convolutional long short-term deep neural network (MCLDNN). The network structure utilized the information of I-channel data, Q-channel data and I/Q-multi-channel data of the original signals and integrated one-dimensional (1D) convolutional, two-dimensional (2D) convolutional and LSTM layers to extract spatio-temporal features. MCLDNN performed significantly better than other network structures above −4 dB SNR and reached an average recognition accuracy of 92% at SNR levels ranging from 0 dB to 18 dB, an improvement of 2∼10% over the others.
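The three input streams mentioned for a multi-stream network of this kind can be pictured as simple views of one complex baseband frame. The sketch below only shows this data layout under assumed conventions (a Python list of complex samples, a 2 × N nested list for the joint stream, and the hypothetical name `split_iq`); it does not implement MCLDNN or any of its layers.

```python
def split_iq(frame):
    """Split a complex frame into I-only, Q-only and joint 2 x N I/Q views."""
    i_channel = [s.real for s in frame]   # in-phase stream
    q_channel = [s.imag for s in frame]   # quadrature stream
    iq = [i_channel, q_channel]           # joint stream: row 0 = I, row 1 = Q
    return i_channel, q_channel, iq


frame = [complex(1, 2), complex(-3, 4), complex(0.5, -0.5)]
i_channel, q_channel, iq = split_iq(frame)
print(len(i_channel), len(q_channel), len(iq), len(iq[0]))
```

In a deep learning framework, the two single-channel views would typically feed 1D convolutions and the joint view a 2D convolution, with the resulting feature maps merged before the recurrent layers.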
In practical scenarios, it is difficult to construct large-scale well-annotated datasets for all domains of interest, and recognition models perform poorly in domains with insufficient data. To address this problem, Bu et al.
[40] proposed an adversarial transfer learning architecture (ATLA), incorporating adversarial training and knowledge transfer in a unified way. The proposed ATLA substantially boosted the performance of the target model. More specifically, the target model achieved a recognition accuracy of 82% with the training data reduced by half, and the accuracy increased by 17.3% relative to supervised learning with one-tenth of the training data. In addition, realistic communication scenarios generally offer few labeled samples and many unlabeled samples. It is almost impossible to implement previously proposed deep learning-based AMC algorithms in this case. Wang et al.
[41] proposed a transfer learning (TL)-based semi-supervised AMC (TL-AMC) method in a zero-forcing-aided multiple-input and multiple-output (ZF-MIMO) system. TL-AMC performed better than CNN-based AMC with limited samples, and at high SNRs it achieved recognition accuracy similar to that of CNN-based AMC trained on massive labeled samples. Most existing AMC methods have been designed under the assumption that the classifier has prior knowledge of the signal and channel parameters. Perenda et al.
[42] proposed two possible directions to make AMC more robust to signal shape transformations introduced by unknown signal and channel parameters. Spatial transformer networks (STNs) and TL were embedded into a light ResNeXt (ResNet next dimension)-based classifier. The proposed method improved the average recognition accuracy by 10∼30% in specific unseen scenarios, with only 5% of labeled data for a large dataset of 20 complex higher-order modulation types. Finally, Table 1 summarizes the above-mentioned deep learning-based modulation recognition algorithms and compares them with the proposed algorithm in terms of advantages, limitations and recognition accuracy.
Table 1. Comparison of deep learning-based algorithms and the proposed algorithm.
With the rapid development of communication technology, the demand for automatic modulation recognition (AMR) in signal processing scenarios has become increasingly urgent.