The next step in dataset complexity at the CSP Blog: cochannel signals.
I’ve developed another data set for use in assessing modulation-recognition algorithms (machine-learning-based or otherwise) that is more complex than the original sets I posted for the ML Challenge (CSPB.ML.2018 and CSPB.ML.2022). Half of the new dataset consists of one signal in noise and the other half consists of two signals in noise. In most cases the two signals overlap spectrally, which is a signal condition called cochannel interference.
Neural networks with CSP-feature inputs DO generalize in the modulation-recognition problem setting.
In some recently published papers (My Papers [50,51]), my ODU colleagues and I showed that convolutional neural networks and capsule networks do not generalize well when their inputs are complex-valued data samples, commonly referred to as simply IQ samples, or as raw IQ samples by machine learners.(Unclear why the adjective ‘raw’ is often used as it adds nothing to the meaning. If I just say Hey, pass me those IQ samples, would ya?, do you think maybe he means the processed ones? How about raw-I-mean–seriously-man–I-did-not-touch-those-numbers-OK? IQ samples? All-natural vegan unprocessed no-GMO organic IQ samples?Uncooked IQ samples?) Moreover, the capsule networks typically outperform the convolutional networks.
In a new paper (MILCOM 2022: My Papers ; arxiv.org version), my colleagues and I continue this line of research by including cyclic cumulants as the inputs to convolutional and capsule networks. We find that capsule networks outperform convolutional networks and that convolutional networks trained on cyclic cumulants outperform convolutional networks trained on IQ samples. We also find that both convolutional and capsule networks trained on cyclic cumulants generalize perfectly well between datasets that have different (disjoint) probability density functions governing their carrier frequency offset parameters.
That is, convolutional networks do better recognition with cyclic cumulants and generalize very well with cyclic cumulants.
So why don’t neural networks ever ‘learn’ cyclic cumulants with IQ data at the input?
The majority of the software and analysis work is performed by the first author, John Snoap, with an assist on capsule networks by James Latshaw. I created the datasets we used (available here on the CSP Blog [see below]) and helped with the blind parameter estimation. Professor Popescu guided us all and contributed substantially to the writing.
Back in 2018 I posted a dataset consisting of 112,000 I/Q data files, 32,768 samples in length each, as a part of a challenge to machine learners who had been making strong claims of superiority over signal processing in the area of automatic modulation recognition. One part of the challenge was modulation recognition involving eight digital modulation types, and the other was estimating the carrier frequency offset. That dataset is described here, and I’d like to refer to it as CSPB.ML.2018.
Then in 2022 I posted a companion dataset to CSPB.ML.2018 called CSPB.ML.2022. This new dataset uses the same eight modulation types, similar ranges of SNR, pulse type, and symbol rate, but the random variable that governs the carrier frequency offset is different with respect to the random variable in CSPB.ML.2018. The purpose of the CSPB.ML.2022 dataset is to facilitate studies of the dataset-shift, or generalization, problem in machine learning.
Throughout the past couple of years I’ve been working with some graduate students and a professor at Old Dominion University on merging machine learning and signal processing for problems involving RF signal analysis, such as modulation recognition. We are starting to publish a sequence of papers that describe our efforts. I briefly describe the results of one such paper, My Papers , in this post.
Does the use of ‘total SNR’ mislead when the fractional bandwidth is very small? What constitutes ‘weak-signal processing?’
Or maybe “Comments on” here should be “Questions on.”
In a recent paper in EURASIP Journal on Advances in Signal Processing (The Literature [R165]), the authors tackle the problem of machine-learning-based modulation recognition for highly oversampled rectangular-pulse digital signals. They don’t use the DeepSig datasets (one, two, three, four), but their dataset description and use of ‘signal-to-noise ratio’ leaves a lot to be desired. Let’s take a brief look. See if you agree with me that the touting of their results as evidence that they can reliably classify signals with ‘SNRs of dB’ is unwarranted and misleading.
An analysis of DeepSig’s 2016.10A data set, used in many published machine-learning papers, and detailed comments on quite a few of those papers.
Update March 2021
See my analyses of three other DeepSig datasets here, here, and here.
Update June 2020
I’ll be adding new papers to this post as I find them. At the end of the original post there is a sequence of date-labeled updates that briefly describe the relevant aspects of the newly found papers. Some machine-learning modulation-recognition papers deserve their own post, so check back at the CSP Blog from time-to-time for “Comments On …” posts.
We continue our progression of Signal-Processing ToolKit posts by looking at the frequency-domain behavior of linear time-invariant (LTI) systems. In the previous post, we established that the time-domain output of an LTI system is completely determined by the input and by the response of the system to an impulse input applied at time zero. This response is called the impulse response and is typically denoted by .
What are the unique parts of the multidimensional cyclic moments and cyclic cumulants?
In this post, we continue our study of the symmetries of CSP parameters. The second-order parameters–spectral correlation and cyclic correlation–are covered in detail in the companion post, including the symmetries for ‘auto’ and ‘cross’ versions of those parameters.
Here we tackle the generalizations of cyclic correlation: cyclic temporal moments and cumulants. We’ll deal with the generalization of the spectral correlation function, the cyclic polyspectra, in a subsequent post. It is reasonable to me to focus first on the higher-order temporal parameters, because I consider the temporal parameters to be much more useful in practice than the spectral parameters.
This topic is somewhat harder and more abstract than the second-order topic, but perhaps there are bigger payoffs in algorithm development for exploiting symmetries in higher-order parameters than in second-order parameters because the parameters are multidimensional. So it could be worthwhile to sally forth.
There are some situations in which the spectral correlation function is not the preferred measure of (second-order) cyclostationarity. In these situations, the cyclic autocorrelation (non-conjugate and conjugate versions) may be much simpler to estimate and work with in terms of detector, classifier, and estimator structures. So in this post, I’m going to provide surface plots of the cyclic autocorrelation for each of the signals in the spectral correlation gallery post. The exceptions are those signals I called feature-rich in the spectral correlation gallery post, such as DSSS, LTE, and radar. Recall that such signals possess a large number of cycle frequencies, and plotting their three-dimensional spectral correlation surface is not helpful as it is difficult to interpret with the human eye. So for the cycle-frequency patterns of feature-rich signals, we’ll rely on the stem-style (cyclic-domain profile) plots that I used in the spectral correlation gallery post.
Learning machine learning for radio-frequency signal-processing problems, continued.
I continue with my foray into machine learning (ML) by considering whether we can use widely available ML tools to create a machine that can output accurate power spectrum estimates. Previously we considered the perhaps simpler problem of learning the Fourier transform. See here and here.
Along the way I’ll expose my ignorance of the intricacies of machine learning and my apparent inability to find the correct hyperparameter settings for any problem I look at. But, that’s where you come in, dear reader. Let me know what to do!
A PSK/QAM/SQPSK data set with randomized symbol rate, inband SNR, carrier-frequency offset, and pulse roll-off.
Update February 2023: I’ve posted a third challenge dataset here. It is CSPB.ML.2023 and features cochannel signals.
Update April 2022. I’ve also posted a second dataset here. This new dataset is similar to the original ML Challenge dataset except the random variable representing the carrier frequency offset has a slightly different distribution.
If you refer to either of the posted datasets in a published paper, please use the following designators, which I am also using in papers I’m attempting to publish:
Update September 2020. I made a mistake when I created the signal-parameter “truth” files signal_record.txt and signal_record_first_20000.txt. Like the DeepSig RML data sets that I analyzed on the CSP Blog here and here, the SNR parameter in the truth files did not match the actual SNR of the signals in the data files. I’ve updated the truth files and the links below. You can still use the original files for all other signal parameters, but the SNR parameter was in error.
Update July 2020. I originally posted signals in the posted data set. I’ve now added another for a total of signals. The original signals are contained in Batches 1-5, the additional signals in Batches 6-28. I’ve placed these additional Batches at the end of the post to preserve the original post’s content.
The statistics-oriented wing of electrical engineering is perpetually dazzled by [insert Revered Person]’s Theorem at the expense of, well, actual engineering.
I recently came across the conference paper in the post title (The Literature [R101]). Let’s take a look.
The paper is concerned with “detect[ing] the presence of ACS signals with unknown cycle period.” In other words, blind cyclostationary-signal detection and cycle-frequency estimation. Of particular importance to the authors is the case in which the “period of cyclostationarity” is not equal to an integer number of samples. They seem to think this is a new and difficult problem. By my lights, it isn’t. But maybe I’m missing something. Let me know in the Comments.
The machine-learning modulation-recognition community consistently claims vastly superior performance to anything that has come before. Let’s test that.
Update February 2023: A third dataset has been posted here. This new dataset, CSPB.ML.2023, features cochannel signals.
Update April 2022: I’ve also posted a second dataset here. This new dataset is similar to the original ML Challenge dataset except the random variable representing the carrier frequency offset has a slightly different distribution.
If you refer to any of the posted datasets in a published paper, please use the following designators, which I am also using in papers I’m attempting to publish:
I’ve decided to post the data set I discuss here to the CSP Blog for all interested parties to use. See the new post on the Data Set. If you do use it, please let me and the CSP Blog readers know how you fared with your experiments in the Comments section of either post. Thanks!
An alternative to the strip spectral correlation analyzer.
Let’s look at another spectral correlation function estimator: the FFT Accumulation Method (FAM). This estimator is in the time-smoothing category, is exhaustive in that it is designed to compute estimates of the spectral correlation function over its entire principal domain, and is efficient, so that it is a competitor to the Strip Spectral Correlation Analyzer (SSCA) method. I implemented my version of the FAM by using the paper by Roberts et al (The Literature [R4]). If you follow the equations closely, you can successfully implement the estimator from that paper. The tricky part, as with the SSCA, is correctly associating the outputs of the coded equations to their proper values.
How do we efficiently estimate higher-order cyclic cumulants? The basic answer is first estimate cyclic moments, then combine using the moments-to-cumulants formula.
In this post we discuss ways of estimating -th order cyclic temporal moment and cumulant functions. Recall that for , cyclic moments and cyclic cumulants are usually identical. They differ when the signal contains one or more finite-strength additive sine-wave components. In the common case when such components are absent (as in our recurring numerical example involving rectangular-pulse BPSK), they are equal and they are also equal to the conventional cyclic autocorrelation function provided the delay vector is chosen appropriately. That is, the two-dimensional delay vector is set equal to .
The more interesting case is when the order is greater than two. Most communication signal models possess odd-order moments and cumulants that are identically zero, so the first non-trivial order greater than two is four. Our estimation task is to estimate -th order temporal moment and cumulant functions for using a sampled-data record of length .
Gaussian and binary signals are in some sense at opposite ends of the pure-impure sine-wave spectrum.
Remember when we derived the cumulant as the solution to the pure th-order sine-wave problem? It sounded good at the time, I hope. But here I describe a curious special case where the interpretation of the cumulant as the pure component of a nonlinearly generated sine wave seems to break down.
Spread-spectrum signals are used to enable shared-bandwidth communication systems (CDMA), precision position estimation (GPS), and secure wireless data transmission.
In this post we look at direct-sequence spread-spectrum (DSSS) signals, which can be usefully modeled as a kind of PSK signal. DSSS signals are used in a variety of real-world situations, including the familiar CDMA and WCDMA signals, covert signaling, and GPS. My colleague Antonio Napolitano has done some work on a large class of DSSS signals (The Literature [R11, R17, R95]), resulting in formulas for their spectral correlation functions, and I’ve made some remarks about their cyclostationary properties myself here and there (My Papers ).
Let’s talk about another published paper on signal detection involving cyclostationarity and/or cumulants. This one is called “Energy-Efficient Processor for Blind Signal Classification in Cognitive Radio Networks,” (The Literature [R69]), and is authored by UCLA researchers E. Rebeiz and four colleagues.
My focus on this paper is its idea that broad signal-type classes, such as direct-sequence spread-spectrum (DSSS), QAM, and OFDM can be reliably distinguished by the use of a single number: the fourth-order cumulant with two conjugated terms. This kind of cumulant is referred to as the cumulant here at the CSP Blog, and in the paper, because the order is and the number of conjugated terms is .
The spectral parameters of HOCS have not proven to be as useful as the temporal parameters unless you include the trivial case where the moment/cumulant order is equal to two. In that case, the spectral parameters reduce to the spectral correlation function, which is extremely useful in CSP (see the TDOA and signal-detection posts for examples).
I recently came across a published paper with the title Cyclostationary Correntropy: Definition and Application, by Aluisio Fontes et al. It is published in a journal called Expert Systems with Applications (Elsevier). Actually, it wasn’t the first time I’d seen this work by these authors. I had reviewed a similar paper in 2015 for a different journal.
I was surprised to see the paper published because I had a lot of criticisms of the original paper, and the other reviewers agreed since the paper was rejected. So I did my job, as did the other reviewers, and we tried to keep a flawed paper from entering the literature, where it would stay forever causing problems for readers.
The editor(s) of the journal Expert Systems with Applications did not ask me to review the paper, so I couldn’t give them the benefit of the work I already put into the manuscript, and apparently the editor(s) did not themselves see sufficient flaws in the paper to merit rejection.
It stings, of course, when you submit a paper that you think is good, and it is rejected. But it also stings when a paper you’ve carefully reviewed, and rejected, is published anyway.
Fortunately I have the CSP Blog, so I’m going on another rant. After all, I already did this the conventional rant-free way.
I came across a paper by Cohen and Eldar, researchers at the Technion in Israel. You can get the paper on the Arxiv site here. The title is “Sub-Nyquist Cyclostationary Detection for Cognitive Radio,” and the setting is spectrum sensing for cognitive radio. I have a question about the paper that I’ll ask below.