1 Introduction
Communication signals present an extremely difficult task for intelligence systems tasked with monitoring the communications spectrum, since they employ a wide range of modulations to achieve high data rates while minimizing interference. As modulations grow more complex, the identification and demodulation processes become increasingly difficult, particularly when trying to extract pertinent information in the field of communications intelligence (COMINT).
Within the COMINT area, this study concentrates on the challenging problem of recognizing and classifying the modulations of intercepted signals, where the main goal is to extract pertinent data. This is essential for determining the type of transmission and thereby making the demodulation process easier. Communications Intelligence (COMINT) focuses on decoding communication signals, whether voice or data, as opposed to Electronic Intelligence (ELINT), which mostly deals with radars. Modulation is the mechanism that allows information to be transferred via communication signals by applying changes to the amplitude, frequency, or phase of the signal. To overcome the difficulties caused by atmospheric attenuation and to enable high-speed transmission, the information is modulated onto a specific carrier frequency. The intelligence procedure, which involves frequency and signal level measurements, starts by intercepting an unknown signal within the enormous communications spectrum. Determining the modulation used to transmit the radio signal, however, is the first and most important challenge. Traditional intelligence approaches tried different demodulators successively, which has proven to be a slow and inefficient process, especially when dealing with contemporary modulations. This environment has changed dramatically with the introduction of artificial intelligence (AI). In the field of wireless research, automatic classification of modulation types at the receiver has attracted a lot of attention due to its ability to improve spectrum usage efficiency. The initial attempts used Convolutional Neural Network (CNN) architectures applied to spectrogram images generated from different modulations. More recent approaches use the in-phase and quadrature (I/Q) samples of the signal, which ideally represent its content. When compared to conventional methods, I/Q data has shown better performance in automatic modulation recognition. Any signal is really made up of two parts: the in-phase component I(t) (cosine) and the quadrature component Q(t) (sine). The waveforms I(t) and Q(t) represent the real and imaginary components of the complex baseband signal that these I/Q samples depict. The actual content of the signal is stored in a matrix with two rows that hold I and Q. This is the whole signal description, which can be stated as s(t) = I(t) + jQ(t).
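As a concrete illustration of this I/Q representation, the following minimal NumPy sketch builds a QPSK baseband frame and stores it as the two-row I/Q matrix described above; the symbol values and the absence of pulse shaping are simplifications for illustration only.

```python
import numpy as np

# Minimal sketch: build a QPSK baseband frame and store it as a 2-row I/Q matrix.
# Symbol values and frame construction are illustrative, not taken from the paper's dataset.
rng = np.random.default_rng(0)
num_samples = 1024                        # samples per frame
symbols = rng.integers(0, 4, num_samples)
phases = np.pi / 4 + symbols * np.pi / 2  # QPSK constellation points on the unit circle

s = np.exp(1j * phases)                   # complex baseband signal s(t) = I(t) + jQ(t)
iq = np.stack([s.real, s.imag])           # 2 x N matrix: row 0 = I (cosine), row 1 = Q (sine)

print(iq.shape)                           # (2, 1024)
```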
This methodology for classifying modulations, which is based on I/Q, is powerful and offers a thorough way to interpret the complex modulations seen in contemporary communication signals.
2 Background
The term Automatic Modulation Classification (AMC) refers to a range of algorithms that are generally divided into classic approaches, which fall into two categories, likelihood-based (LB) and feature-based (FB), and advanced techniques that utilize deep learning.
2.1 Common Approaches
2.1.1 Likelihood-Based Techniques
Likelihood-based methods were prevalent in the early phases of Automatic Modulation Classification (AMC). These methods involve precise probability function computation for several types of modulation. The basic concept is to identify the most likely modulation type by comparing the received signal to a set of predefined likelihood functions. Computational models and probability theories are used by likelihood-based techniques to address modulation identification difficulties in scenarios with both known and unknown channel information
[1]. While these approaches can achieve optimal classification accuracy when perfect knowledge of the signal and channel models is assumed, estimating the model parameters requires significant computational complexity
[2, 3].
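As a rough illustration of the likelihood-based principle (not the exact formulation used in the cited works), the sketch below computes the log-likelihood of received I/Q samples under each candidate constellation, assuming an AWGN channel with known noise variance, and selects the most likely modulation.

```python
import numpy as np

def log_likelihood(r, constellation, noise_var):
    """Log-likelihood of received complex samples r under an AWGN channel,
    averaging over equiprobable constellation points."""
    d2 = np.abs(r[:, None] - constellation[None, :]) ** 2    # (N, M) squared distances
    log_p = -d2 / noise_var - np.log(np.pi * noise_var)      # per-point complex-Gaussian log densities
    return np.sum(np.logaddexp.reduce(log_p, axis=1) - np.log(len(constellation)))

# Hypothetical candidate set: BPSK and QPSK unit-energy constellations.
candidates = {
    "BPSK": np.array([1 + 0j, -1 + 0j]),
    "QPSK": np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4))),
}

rng = np.random.default_rng(1)
true_syms = candidates["QPSK"][rng.integers(0, 4, 1000)]
noise = np.sqrt(0.05 / 2) * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
received = true_syms + noise

scores = {name: log_likelihood(received, c, noise_var=0.05) for name, c in candidates.items()}
print(max(scores, key=scores.get))        # expected: "QPSK"
```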
2.1.2 Feature-Based Approaches
Within AMC, feature-based methods provide a practical way to differentiate modulation patterns. This approach, which offers a sensible compromise between computational cost and classification accuracy, is based on feature extraction and classifier construction. The underlying idea is to extract the distinctive properties of different signals without having to derive the signal's likelihood function mathematically. Pre-processing the signal and extracting pertinent features are the two crucial phases of the feature-based approach; a classifier is then used to classify the signal according to these features. The careful selection of signal properties and the development of robust classifiers are critical to the success of this technique. Feature-based approaches are particularly useful when algorithm complexity must be kept to a minimum, which makes them appropriate for real-time applications and resource-constrained environments
[4].
Although feature-based methods exhibit adaptability to various channel models, they encounter significant limitations, including the weak discriminatory capability of manually crafted features and the constrained learning capacity of conventional classification algorithms
[5–7].
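As an illustrative example of hand-crafted features (a common choice in the feature-based literature, not necessarily the one used in the cited works), the sketch below computes a few higher-order cumulants from a complex baseband frame; such features could then be fed to a conventional classifier.

```python
import numpy as np

def cumulant_features(s):
    """Scale-invariant higher-order cumulant features of a zero-mean complex baseband frame."""
    c20 = np.mean(s ** 2)
    c21 = np.mean(np.abs(s) ** 2)                       # signal power
    c40 = np.mean(s ** 4) - 3 * c20 ** 2
    c42 = np.mean(np.abs(s) ** 4) - np.abs(c20) ** 2 - 2 * c21 ** 2
    return np.array([np.abs(c20) / c21,                 # distinguishes e.g. BPSK from QPSK
                     np.abs(c40) / c21 ** 2,
                     np.abs(c42) / c21 ** 2])

# These features would typically feed a conventional classifier such as an SVM or a decision tree.
```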
2.2 Advanced Approaches
With the rapid development of Artificial Intelligence (AI) technology, the remarkable data processing capabilities of deep learning (DL) have attracted a lot of attention and have been applied in many different fields, including radio signal processing for communications. Research on the application of deep learning to AMC is ongoing, with new methods and architectures being proposed to increase classification accuracy and lower computational complexity. Indeed, using DL to solve traditional feature-based signal classification problems offers an efficient and affordable alternative for AMC. To overcome the present drawbacks of conventional methods, a number of novel AMC techniques utilizing deep networks, such as deep neural networks (DNNs), convolutional neural networks (CNNs)
[8], recurrent neural networks (RNNs), and long short-term memory networks (LSTMs)
have been developed. Nevertheless, the overfitting problem caused by a large number of network parameters may still affect the effectiveness of these deep learning-based AMC techniques
[10].
Figure 1 Global processing flow of deep learning-based AMC
2.3 Ensemble Learning for AMC
Ensemble learning has become a potent paradigm in machine learning, with notable performance across a range of domains. The idea is to combine predictions from various models to improve overall performance, which leads to increased accuracy and resilience
[11].
Because ensemble models can handle the complex and dynamic character of communication signals, they have gained interest in the context of Automatic Modulation Classification (AMC). Ensemble models incorporate information from several sources, which allows them to capture complex patterns present in different forms of modulation and different SNR situations
[12, 13]. Using ensemble models in the context of AMC has various benefits: they perform well with a variety of modulation types, adapt well to changes in signal-to-noise ratio, and yield higher classification accuracy. Innovative architectures and approaches are among the latest developments in ensemble models for AMC. Prominent examples include deep learning ensemble models that utilize architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), and deep neural networks (DNNs). These models show promise for lowering computational complexity and increasing classification accuracy
[14].
Notwithstanding their achievements, there are still difficulties in creating efficient ensemble models for AMC. Achieving the right balance between coherence and model diversity is essential. Furthermore, research is ongoing to address overfitting-related concerns and to guarantee that ensemble models generalize across various signal conditions. The progress and difficulties described in the literature on ensemble models for AMC motivate the ensemble model proposed in this study, in which ResNet and Transformer neural networks are combined in order to exploit their complementary characteristics. A rigorous examination shows that existing ensemble models for AMC have gaps and room for improvement. By combining state-of-the-art architectures and refining the ensemble learning procedure for more effective modulation classification, the proposed model seeks to close these gaps.
3 The Proposed Approach
In this section we present a novel method for modulation classification that makes use of an ensemble of two powerful neural network models, a Transformer Neural Network (TNN) and a Residual Network (ResNet), one of which is tuned to accurately predict signals with low SNRs and the other signals with high SNRs. The main goal is to optimize each model to perform well under certain Signal-to-Noise Ratio (SNR) conditions in order to meet the challenges caused by variable SNRs. By using the complementary qualities of the TNN, which is skilled at managing sequential data and capturing temporal dependencies, and ResNet, which is skilled at spatial feature extraction, the ensemble architecture seeks to maximize performance.
Figure 2 Proposed ensemble model (ResNet with TNN)
3.1 High SNR Model: Residual Net
Microsoft researchers introduced the ResNet (Residual Network) architecture, a kind of convolutional neural network (CNN), in 2015
[15]. ResNet's main novelty is its use of “residual connections”, which let the network learn a residual function with respect to the layer input instead of the full input-to-output mapping directly. Because of this, it is now feasible to train far deeper networks than before without running into the vanishing gradient problem, while maintaining acceptable performance. ResNet is applicable to AMC and has been employed to attain state-of-the-art results on a range of image classification tasks
[16].
The ResNet architecture has two primary components: the residual unit and the residual stack. The residual stack is made up of a series of residual units, each including multiple layers; it is in charge of deepening the network and enabling it to learn complex features from the data. The residual unit is at the heart of the ResNet architecture: it consists of two or more convolutional layers, and the input of the unit is added to the output of its last layer before being passed to the subsequent layer, so that the unit learns a residual function with respect to its input.
The choice of ResNet for high SNR is due to the fact that the skip connections preserve information from the source and allow the network to learn a residual function. The residual unit also includes a batch normalization layer, which is used to normalize the output of the convolutional layers and improve the stability of the network.
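A minimal sketch of such a residual unit in Keras is shown below; the filter count, kernel size, and activation are illustrative choices, not the exact hyperparameters of the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_unit(x, filters=32, kernel_size=3):
    """Minimal 1-D residual unit: two conv layers with batch normalization and a skip connection."""
    shortcut = x
    y = layers.Conv1D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])            # add the unit's input to its output
    return layers.Activation("relu")(y)

# Example: a small residual stack applied to I/Q frames of shape (1024, 2).
inputs = tf.keras.Input(shape=(1024, 2))
x = layers.Conv1D(32, 3, padding="same")(inputs)   # project to the stack's channel width
for _ in range(3):
    x = residual_unit(x)
```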
3.2 Low SNR Model: Transformer Neural Network
The TNN architecture can easily handle sequence-to-sequence tasks with long-range dependencies
[17]. Transformer models detect the influence and dependency of distant data items by using a series of mathematical operations called attention or self-attention. By using attention mechanisms to weight the various components of the input signal differently, the network can concentrate on the most crucial components, such as the signal of interest, and ignore the noise. The primary job of this mechanism is to identify which features in the input are significant for the target and which are not, by computing weighting coefficients that are used to form a weighted sum of the input for a specific objective.
The Transformer neural network architecture consists of multiple layers, organized into an encoder and a decoder. The encoder is made up of multiple layers of self-attention and feed-forward neural networks. The self-attention mechanism allows the model to balance the significance of various input components when generating predictions, and the output of the self-attention layer is processed by a feed-forward neural network. The decoder likewise consists of multiple layers of self-attention and feed-forward neural networks. In addition, the decoder employs a technique known as “masked self-attention”, which stops the model from “peeking” forward in the input sequence when making predictions. The transformer architecture also has a Multi-Head Attention mechanism that enhances the model's comprehension of the input by allowing it to attend to different sections of the input simultaneously; it is computationally efficient and highly parallelizable. The following architecture is employed:
● The transformer block, which adds non-linearity to the model through a feed-forward neural network (FFN) with 256 nodes, increases the model's capacity.
● Global Average Pooling: computes one average value per feature across the sequence dimension of the input tensor.
● Alpha Dropout (0.3): to avoid overfitting, randomly drops a given percentage of the activations while keeping the mean and variance of the input at their original values.
● Two fully connected layers along with Alpha Dropout (0.2): the activation function applied is SELU, which stands for Scaled Exponential Linear Unit.
The transformer neural network has been chosen for low SNR signals because it is able to handle sequential data such as time series, and it has been shown to be effective in tasks that require understanding the context and dependencies among different inputs. Indeed, our method involves using a transformer encoder to extract features from a low SNR signal, which are then used by a transformer decoder to reconstruct the signal. The encoder and decoder are jointly trained to reduce the reconstruction error between input and output signals.
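A minimal Keras sketch of the encoder path described above follows; the head count and embedding size are assumptions, while the 256-node FFN, global average pooling, Alpha Dropout rates, SELU dense layers, and 24 output classes follow the descriptions in this paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def transformer_block(x, num_heads=4, ff_dim=256):
    """Encoder block: multi-head self-attention followed by a 256-node feed-forward network."""
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=x.shape[-1])(x, x)
    x = layers.LayerNormalization()(x + attn)
    ffn = layers.Dense(ff_dim, activation="relu")(x)
    ffn = layers.Dense(x.shape[-1])(ffn)
    return layers.LayerNormalization()(x + ffn)

inputs = tf.keras.Input(shape=(1024, 2))              # I/Q frame: 1024 time steps, 2 channels
x = layers.Dense(64)(inputs)                          # assumed embedding of the I/Q samples
x = transformer_block(x)
x = layers.GlobalAveragePooling1D()(x)
x = layers.AlphaDropout(0.3)(x)
x = layers.Dense(128, activation="selu")(x)
x = layers.AlphaDropout(0.2)(x)
x = layers.Dense(64, activation="selu")(x)
outputs = layers.Dense(24, activation="softmax")(x)   # 24 modulation classes
model = tf.keras.Model(inputs, outputs)
```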
3.3 Ensemble Model Integration
The ensemble proposed in this paper makes use of the synergies between two distinct deep learning architectures, the Transformer Neural Network (TNN) and the Residual Network (ResNet). The purpose of this integration is to take advantage of the TNN's efficiency in managing temporal dependencies and sequential data, as well as ResNet's skill in capturing spatial information.
ResNet performs exceptionally well at differentiating modulation signals in clean, low-noise conditions because it is optimized for high Signal-to-Noise Ratio (SNR) environments. ResNet's spatial feature extraction output becomes a critical input for a smooth ensemble integration, and the model is trained to generate predictions from these spatial features. The Transformer Network, on the other hand, can capture temporal dependencies because it is designed for low SNR conditions and can process sequential data with ease. In the ensemble, the TNN's output, enriched by its self-attention mechanisms, contributes predictions based on sequential patterns in the signal.
Our ensemble model is unique in that it produces predictions simultaneously with both ResNet and TNN. Each model processes the incoming signal independently and produces a prediction. Then, using the ensemble decision-making mechanism, the final output is chosen based on which of the two predictions is the maximum. This strategy guarantees that the ensemble benefits from the advantages of both models, offering a reliable and flexible classification method. The models are jointly optimized during the ensemble's joint training: the optimization of ResNet's and TNN's parameters is combined with the decision-making process that selects the maximum prediction. The method is adaptable, as the ensemble learns to adapt dynamically to the varied difficulties posed by different SNR conditions. The dual predictions and the decision-making process are supported by an improved ensemble architecture: further layers and connections are added to simplify the information flow between ResNet and TNN while maintaining their distinct contributions to the overall classification process.
The final output of the proposed ensemble model, which uniquely combines the simultaneous predictions of TNN and ResNet, is determined by selecting the maximum prediction. This dynamic approach makes the ensemble more resilient and able to leverage the advantages of both models, which improves the accuracy of modulation classification under a variety of SNR situations.
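A minimal sketch of this maximum-prediction decision rule is given below, assuming both models output per-class softmax scores; the combination layer illustrates the rule described above rather than the paper's exact implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_ensemble(resnet_model, tnn_model):
    """Combine two classifiers by taking, per class, the maximum of their softmax outputs."""
    inputs = tf.keras.Input(shape=(1024, 2))
    p_resnet = resnet_model(inputs)                 # (batch, num_classes) softmax scores
    p_tnn = tnn_model(inputs)
    p_max = layers.Maximum()([p_resnet, p_tnn])     # element-wise maximum of the two predictions
    return tf.keras.Model(inputs, p_max)

# At inference time the predicted class is simply the argmax over the combined scores:
# y_hat = tf.argmax(ensemble(x_batch), axis=-1)
```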
4 Experimental Results
4.1 Experimental Setting: Dataset Selection and Characteristics
In order to verify the effectiveness of our proposed approach, we have assembled an extensive dataset comprising both synthetic and real data. The dataset combines synthetic frames with simulated channel effects and real recorded frames, and it is deliberately constructed to cover a wide variety of modulation settings.
4.1.1 Dataset Composition
The dataset consists of the following key components:
● Synthetic Data: To capture the complexity of real-world communication, our synthetic dataset includes twenty-four distinct modulation types. Among these are, notably, high-order modulations that are common in high-SNR, low-fading channel settings.
● Real Gathered Data: To further improve the realism of our dataset, we included 44,876 real-world recorded frames, each representing a distinct modulation at a range of noise levels. The real-world noise introduces difficulties that resemble actual communication conditions.
4.1.2 Dataset Structure
The dataset is structured as follows:
● Size: Our dataset comprises 2,600,780 samples overall, which guarantees a strong representation of various modulation scenarios.
● Split: To guarantee an objective assessment of the model, we divided the dataset into training (80%) and testing (20%) sets while preserving a balanced distribution.
4.1.3 Modulation Types
Our dataset covers a spectrum of modulation types (See Figures 3 and 4), including:
Figure 3 PSK4 modulations
Figure 4 4DPSK modulation in constellation representation
● PSK Modulations: QPSK, 8PSK, 16PSK, 32PSK, 16APSK, 32APSK, 64APSK, 128APSK.
● QAM Modulations: 16QAM, 32QAM, 64QAM, 128QAM, 256QAM.
● Others: AM-SSB-WC (Amplitude Modulation, Single Sideband, With Carrier).
4.1.4 Synthetic Dataset Details
The synthetic dataset is characterized by:
● SNR Levels: Featuring twenty-six levels of Signal-to-Noise Ratio (SNR) for each modulation type, providing a comprehensive range of noise conditions.
● Frame Composition: Comprising 4,096 frames for each modulation-SNR combination, with each frame containing 1,024 complex time-series samples.
● Data Format: Samples are represented as floating-point in-phase and quadrature (I/Q) components.
4.1.5 Real Dataset Characteristics
The real dataset introduces realistic challenges:
● Frame Count: 44,876 frames in total, each of which represents a distinct modulation in the presence of ambient noise.
● Classification Challenges: The real dataset's noise component makes modulation classification more challenging, mirroring the complexity seen in real-world applications.
● The integration of the synthetic and real datasets resulted in a cohesive collection including 2,600,780 samples, which provides a comprehensive portrayal of various modulation scenarios.
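To make this organization concrete, the sketch below shows how frames in this format might be loaded and split; the file name and array keys are hypothetical, while the shapes follow the frame description above (1,024 complex samples stored as two float I/Q components) and the 80/20 split.

```python
import numpy as np

# Hypothetical file and keys; shapes follow the dataset description.
data = np.load("modulation_dataset.npz")
frames = data["frames"]          # (num_frames, 1024, 2): float I and Q components per sample
labels = data["labels"]          # (num_frames,): integer modulation class
snrs = data["snrs"]              # (num_frames,): SNR in dB of each frame

# 80/20 train/test split with a fixed shuffle for reproducibility.
rng = np.random.default_rng(42)
idx = rng.permutation(len(frames))
split = int(0.8 * len(frames))
train_idx, test_idx = idx[:split], idx[split:]
x_train, y_train = frames[train_idx], labels[train_idx]
x_test, y_test, snr_test = frames[test_idx], labels[test_idx], snrs[test_idx]
```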
4.1.6 Technical Implementation
TensorFlow is used as the backend for all neural network implementations built using Keras, guaranteeing a stable and uniform framework for model construction and assessment. To summarize, the content, structure, and incorporation of synthetic and real-world data within our dataset provide a strong basis for assessing the effectiveness of our proposed ensemble model in a range of realistic scenarios.
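As an illustration of the Keras training and per-SNR evaluation setup, the following sketch reuses the `model` and data arrays from the earlier sketches; the optimizer, batch size, and epoch count are assumptions, not the paper's reported settings.

```python
import numpy as np

# Assumed training configuration; `model`, x_train/y_train, x_test/y_test, snr_test
# come from the previous sketches.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=256, epochs=20, validation_split=0.1)

# Per-SNR accuracy, as used to analyze performance at, e.g., 30 dB or at low SNR levels.
preds = np.argmax(model.predict(x_test), axis=-1)
for snr in np.unique(snr_test):
    mask = snr_test == snr
    acc = np.mean(preds[mask] == y_test[mask])
    print(f"SNR {snr:+.0f} dB: accuracy = {acc:.3f}")
```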
4.2 Results for High SNR (ResNet)
Following a series of tests, we found that the high signal-to-noise ratio (SNR) portion of the dataset could be classified with nearly perfect accuracy using the ResNet model. The model's maximum accuracy, obtained at 30 dB, was 95.9% (Figure 5). However, classification of low SNR signals was very poor (around 35%). This is caused by the noise effect and is also connected to specific modulation signals that are undoubtedly harder to categorize because of their particular signal properties. The consistency of our results across all test cases (Figures 6~8) shows that this deep learning model is resilient and generalizable for predicting high SNR signals rather than those in low SNR situations. Thus, the accuracy of our proposed ResNet-based method outperforms other state-of-the-art techniques for high SNR situations, which demonstrates the promise of our approach as a dependable solution for automatic modulation classification tasks under high SNR conditions.
Figure 5 Confusion matrix of the ResNet model at 30 dB SNR
Figure 6 Confusion matrix of the ResNet model at 20 dB SNR
Figure 7 Confusion matrix of the ResNet model at 14 dB SNR
Figure 8 Confusion matrix of the ResNet model at 26 dB SNR
4.3 Results for Low SNR (Transformer Network)
After extensive testing under very low SNR settings, the Transformer Network (TNN) demonstrated remarkable performance. With an accuracy rate of 72.6%, the model showed a strong capacity to both precisely recognize and reconstruct input signals. This result represents a significant improvement over earlier approaches, which found it difficult to achieve high accuracy rates under low signal-to-noise ratio circumstances. The Transformer Network's effectiveness in such demanding environments makes it a game-changer in the field of modulation classification under unfavorable signal-to-noise scenarios. The Transformer Network's superiority is immediately obvious when compared to the ResNet model, which had trouble recognizing signals in low SNR situations (ranging from 20 dB to 0 dB) and only managed a maximum accuracy of 20%. This striking difference emphasizes how conventional deep learning models, like ResNet, are inherently limited when it comes to signal processing in low signal-to-noise ratio settings. The architectural elements of the Transformer Network, in particular its self-attention mechanisms, are responsible for its resilience in low signal-to-noise ratio situations. These mechanisms enable the model to filter out noise while selectively focusing on pertinent components of the incoming signal. By intelligently attending to the informative portions of the signal, the Transformer Network exhibits a distinct resistance to the difficulties presented by low SNR settings, which leads to a notable improvement in classification accuracy. The Transformer Network's effectiveness in low SNR environments has promising implications for real-world applications, especially in communication systems where noise interference is a major concern. Because of its capacity to maneuver in complex signal settings, the model is a useful tool for modulation classification tasks in situations where maintaining signal integrity at low SNR is crucial.
In summary, the Transformer Network represents a major advancement in the creation of reliable and accurate modulation classification models, especially when dealing with difficult noise-filled communication channels, thanks to its exceptional low SNR performance and architectural advantages.
4.4 Results of Ensemble Model
The experimental findings of the deep ensemble learning model, shown in Figures 9~12, provide a thorough understanding of the model's behavior across a range of Signal-to-Noise Ratios (SNRs). The selected architecture consistently outperforms the baseline models, exhibiting superior performance in both low and high SNR situations. Across the various SNR conditions, it is evident that the proposed ensemble design achieves higher overall accuracy. Figures 10 and 11 show how well the model performs in low SNR settings, and Figures 9 and 12 show how well it performs in high SNR settings.
Figure 9 Confusion matrix of the ensemble model at 30 dB SNR
Figure 10 Confusion matrix of the ensemble model at 20 dB SNR
Figure 11 Confusion matrix of the ensemble model at 18 dB SNR
Figure 12 Confusion matrix of the ensemble model at 24 dB SNR
These results highlight the effectiveness of the ensemble learning strategy in improving the model's accuracy and stability under a range of SNR conditions. The ensemble model's ability to adapt and perform optimally under various signal difficulties is demonstrated by its consistent ability to outperform the individual baseline models. Interestingly, our findings show a notable trend: at decreasing signal-to-noise ratios (SNR), the ensemble model's classification performance is almost 50% better than that of the single baseline model, ResNet. This significant improvement in low SNR circumstances emphasizes ensemble learning's inherent ability to reduce noise and enhance classification accuracy in situations where signal clarity is impaired. The ensemble model's observed performance has important implications for modulation classification tasks in real-world communication contexts. Because the model can retain high accuracy under a variety of SNR situations, it is a reliable solution for real-world scenarios where signal quality can fluctuate greatly. The results of our model therefore represent a new achievement in terms of accuracy for high SNR signals and, above all, for low SNR signals; it is the best accuracy rate in the state of the art at low SNR. These results are due to the combination of the two algorithms and the use of the TNN in noisy environments. In addition, the quality of the dataset, which includes real gathered data and not only synthetic data, has brought this work very close to real-world conditions.
In summary, the improved performance of the ensemble learning model across various SNR levels indicates its flexibility and robustness against a range of signal difficulties. These findings support the use of ensemble learning as a useful technique for improving the stability and accuracy of modulation classification models, especially in dynamic communication contexts with large SNR fluctuations.
4.5 Advantages in Practical Applications
To clarify the advantages of the chosen models in practical applications, we highlight the following points:
4.5.1 ResNet in High SNR Environments
Detailed Capture of Spatial Features: ResNet is particularly good at extracting spatial characteristics from high-dimensional data, which contributes to its efficacy in high Signal-to-Noise Ratio (SNR) circumstances. The architecture's inventive use of residual connections enables the network to discover complex structures and patterns in the data. In modulation classification problems with high SNR and little fading, ResNet shines at extracting and understanding spatial information. This ability is essential for precisely differentiating modulation signals in clean, low-noise environments.
Robust Signal Separation in Noise-Free Environments: In clean, high SNR surroundings, ResNet exhibits a remarkable capacity to recognize minor differences between modulation signals. When signal clarity is critical, the model's ability to navigate complex spatial patterns guarantees a high degree of accuracy in recognizing modulation schemes, adding to its dependability. Its robustness in noise-free environments makes ResNet a solid option for applications where the integrity of the transmitted signal is crucial, including high-quality communication channels.
Relevance in High-SNR, Low-Fading Real-World Channels: Moreover, ResNet is suitable for high-SNR, low-fading channel situations encountered in real life. It works effectively in situations where the signal intensity is constantly strong because of its flexibility in handling different signal complexities. Its versatility makes it more useful in communication systems where keeping a high signal-to-noise ratio (SNR) is crucial. This allows for dependable performance under circumstances similar to those seen in steady communication channels.
4.5.2 TNN in Low SNR Environments
Managing Sequential Data Accurately: In low Signal-to-Noise Ratio (SNR) conditions, the Transformer Neural Network (TNN) shows promise as a reliable solution for modulation classification tasks. One of its strongest points is how well it handles sequential data, which is very useful in situations with low signal-to-noise ratios. The attention-based architecture of TNN makes it possible for it to precisely analyze sequential input signals, which facilitates the efficient extraction of temporal connections.
Focusing Only on Important Signal Components: TNN's attention mechanisms give the model the ability to selectively focus on pertinent portions of the input signal. This capacity to identify and rank informative signal portions is useful in low SNR environments, where noise can obscure important signal components. This selective focus makes the model more resilient to noise interference and improves its classification accuracy for modulation schemes in difficult, low signal-to-noise ratio settings.
Ability to Adjust to Noisy Real-World Communication Channels: Real-world communication channels filled with noise and interference can benefit from TNN's capability for modulation classification under low SNR circumstances. Because of its attention mechanisms and its capacity to handle sequential data, TNN is a promising option for applications where noise-induced signal degradation is a common problem. The model's flexibility in such noisy communication channels emphasizes its applicability in real-world situations where signal clarity varies. Essentially, our choice was determined by carefully weighing the benefits and drawbacks of both ResNet and Transformer Neural Networks: ResNet performs better in situations with high SNR, while TNN performs better in those with low SNR. Through model fusion, the ensemble leverages each model's unique strengths to generate a robust solution with enhanced accuracy and stability over a wide range of SNR circumstances.
5 Conclusion
With the advancement of Artificial Intelligence, encompassing Deep Learning, neural networks, and other technologies, automatic modulation classification (AMC), a fundamental component of communication signal processing, has become increasingly important in fields like cognitive electronic warfare and cognitive radio (CR). Its main objective is to correctly categorize the modulation modes of received signals. This research presents an end-to-end deep learning model for classifying modulation signals that integrates the predictive ability of many features and improves model stability through the use of an ensemble learning network. Ensemble learning techniques are frequently used to handle multi-class classification problems and improve overall classification accuracy; they work by combining complementary features and by highlighting the strengths of each constituent model. To create a solid algorithmic framework with great adaptability, our method utilizes the advantages of two deep learning architectures, ResNet and the Transformer network, which complement one another. Our tests have shown that the proposed deep ensemble method can accomplish stable and accurate classification at both high and low SNRs. In future work, we will try to enlarge the dataset as much as possible and further tune the algorithms used. Further projects in this domain of Communications Intelligence will seek an automatic solution for classifying the coding methods used over communication signals.